manual/manual.tex

   1
   2 \documentclass[11pt]{article}
   3
   4 \usepackage{calc}
   5 \usepackage{graphics}
   6 %\usepackage{latex8}
   7 \usepackage{times}
   8 \usepackage{epsf}
   9 \usepackage{epsfig}
  10 \usepackage{graphicx}
  11 \usepackage{changebar}
  12 \usepackage{portland}
  13 \usepackage{lscape}
  14
  15 \setlength{\textheight}{8.50in}
  16 \setlength{\textwidth}{6.5in}
  17 \setlength{\topmargin}{-0.3in}
  18 %\setlength{\leftmargin}{2.9in}
  19 %\setlength{\rightmargin}{-2.9in}
  20 \setlength{\oddsidemargin}{0in}
  21 \setlength{\parindent}{0.5in}
  22 \setlength\parindent{0in}
  23 \setlength\parskip{0.1in}
  24
  25 \newcommand{\note}[1]{{$\rightarrow$ \bf Note: \emph{#1}}}
  26
  27 \begin{document}
  28
  29 \title{
  30 \includegraphics[height=1.5in]{v3vee.pdf}
  31 \includegraphics[height=1.5in]{logo6.png} \\
  32 \vspace{0.5in}
  33 Palacios Internal and External Developer Manual
  34 }
  35
  36
  37 \maketitle
  38
  39
  40 This manual is written for internal and external Palacios
  41 developers. It contains information on how to obtain the Palacios code
  42 base, how to go about the development process, and how to commit those
  43 changes to the mainline source tree.  This assumes that the reader has
  44 read the technical report {\em An Introduction to the Palacios Virtual
  45 Machine Monitor -- Release 1.0}\footnote{It's important to note that
  46 there have been substantial changes in the build process from 1.0 to
  47 1.2 and beyond.  Hence, the technical report is primarily useful as an
  48 explanation of the theory of operation of Palacios.  This document is
the one to consult for the build process.}  and also has some
working knowledge of {\em git}.  You will also want to read the
  51 document {\em Building a bootable guest image for Palacios and Kitten}
  52 in order to understand how to build an extremely lightweight guest
  53 environment, suitable for testing.  If you want to configure network
  54 booting for testing on real hardware, you'll want to read the document
  55 {\em Booting Palacios/Kitten Over the Network Using PXE}.
  56
  57 Please note that Palacios and Kitten are under rapid development,
  58 hence this manual may very well be out of date!
  59
  60 \newpage
  61 \tableofcontents
  62 \newpage
  63 \listoffigures
  64 \newpage
  65
  66 \section{Overview}
  67
  68
  69 Both Palacios and Kitten follow a hybrid development process that
  70 uses both the centralized repository and distributed development
  71 models. A central repository exists that holds the master version of
  72 the code base. This central repository is cloned by multiple people
  73 and in multiple places to support various development efforts. A
  74 feature of \texttt{git} is that every developer actually has a full copy of
  75 the entire repository, and so can function independently until such
  76 time as they need to re-sync with the master version.
  77
Access to the central repository is granted at several levels,
depending on the type of developer. The three basic developer types
and their access privileges are:
  82
  83 \begin{itemize}
  84 \item Core developers: These are the lead developers and are in
  85 charge of managing the master repository. They have full read/write
  86 access permissions to the central repository.
  87
  88 \item Internal developers: Formal members of the development
  89 team. These people are capable of pulling directly from the central
  90 repository, but lack the ability to write directly to it.
  91
  92 \item External developers: People who are not actual members of the
  93 development team. These people can only access the public repository
  94 which is only updated to contain the release versions.
  95 \end{itemize}
  96
  97 Students doing independent study or REUs related to Palacios are set
  98 up as internal developers.  EECS 441 (Resource Virtualization)
  99 students are generally either set up as internal or external
 100 developers, depending on their projects.
 101
 102 Because internal and external developers cannot write directly to the
 103 master repository, they need to first submit their changes to a core
 104 developer before they can be added to the mainline. We will discuss
 105 that process in Section~\ref{sec:submission}.
 106
 107
 108 \section{Checking out Palacios}
 109
 110 The central Palacios repository is located on {\em
 111 newskysaw.cs.northwestern.edu} in {\em /home/palacios/palacios}. All
internal developers have read access to the directory. Each developer
must create their own local copy of the repository, which is done
with {\em git clone}.
 115
 116 \begin{verbatim}
 117 git clone /home/palacios/palacios
 118 \end{verbatim}
 119
 120 On the machine {\em newbehemoth.cs.northwestern.edu} you will want to
 121 use the following command instead. The {\em newskysaw} home
 122 directories are NFS-mounted on /home-remote.
 123
 124 \begin{verbatim}
 125 git clone /home-remote/palacios/palacios
 126 \end{verbatim}
 127
 128
 129 On any other machine, you can clone the repository via ssh, provided
 130 you have a newskysaw account:
 131
 132 \begin{verbatim}
 133 git clone ssh://you@newskysaw.cs.northwestern.edu//home/palacios/palacios
 134 \end{verbatim}
 135
 136 External developers can clone the public repository via the web.  The
 137 public repository tracks the release and main development branch
 138 (e.g., devel) of the internal repository with a 30 minute delay.  The
 139 access information is available on the web site (http://v3vee.org).
The web site also includes a publicly accessible gitweb interface to
allow browsing of the public repository.  The clone command looks like:
 142
 143 \begin{verbatim}
 144 git clone http://v3vee.org/palacios/palacios.web/palacios.git
 145 \end{verbatim}
 146
 147 No matter how you clone, the clone command creates a local copy of the
 148 repository at {\em ./palacios/}.
 149
 150 Note that both {\em newskysaw} and {\em newbehemoth} have all the
 151 tools installed that are needed to build and test Palacios and Kitten.
 152 If you develop on another machine, you will need to set those tools up
 153 for yourself.  This isn't hard and the tools are all free.  See the
 154 technical report for what tools you will need.
 155
 156 When you first clone the repository, you will get the {\em master}
 157 branch, which is used to generate releases.   All development work is
 158 done in the {\em devel} branch of the repository. The developer can
 159 access this branch via:
 160
 161 \begin{verbatim}
 162 git checkout --track -b devel origin/devel
 163 \end{verbatim}
 164
 165 or
 166
 167 \begin{verbatim}
 168 /opt/vmm-tools/bin/checkout_branch devel
 169 \end{verbatim}
 170
 171
 172 {\em Important:} Note that Palacios is very actively developed so the
 173 contents of the {\em devel} branch are frequently changing (and
 174 broken!). In order to keep up to date with the latest version, it is
 175 necessary to periodically pull the latest changes from the master
 176 repository by running \verb.git pull..
 177
 178
 179 The released versions of Palacios are, currently, 1.0, 1.1, and 1.2.
 180 To switch to the current release branch, execute
 181
 182 \begin{verbatim}
 183 git checkout --track -b Release-1.2 origin/Release-1.2
 184 \end{verbatim}
 185
 186 or
 187
 188 \begin{verbatim}
 189 /opt/vmm-tools/bin/checkout_branch Release-1.2
 190 \end{verbatim}
 191
 192
 193
 194
 195 \section{Checking out Kitten}
 196
 197 Kitten is available from Sandia National Labs, and is the main host OS
 198 we are targeting with Palacios. Loosely speaking, core Palacios
 199 developers are internal Kitten developers, and internal Palacios
 200 developers are external Kitten developers. The public repository for
 201 Kitten is at {\em http://code.google.com/p/kitten}.  To simplify
 202 things, we are maintaining a local mirror copy on newskysaw in {\em
 203 /home/palacios/kitten} that tracks the public repository.
 204
 205 Kitten uses Mercurial for source management, so you will have to make
 206 sure the local Mercurial version is configured correctly.
 207 Specifically you will probably need to add something like the
 208 following Python path to your shell environment.
 209
 210 \begin{verbatim}
 211 export PYTHONPATH=/usr/local/lib64/python2.4/site-packages/
 212 \end{verbatim}
 213
 214 You can then clone Kitten from the local mirror.   On {\em newskysaw},
 215 run:
 216 \begin{verbatim}
 217 hg clone /home/palacios/kitten
 218 \end{verbatim}
 219 On {\em newbehemoth}, run
 220 \begin{verbatim}
 221 hg clone /home-remote/palacios/kitten
 222 \end{verbatim}
 223 On other machines, run
 224 \begin{verbatim}
 225 hg clone ssh://you@newskysaw.cs.northwestern.edu//home/palacios/kitten
 226 \end{verbatim}
 227 External developers, run
 228 \begin{verbatim}
 229 hg clone https://kitten.googlecode.com/hg/ kitten
 230 \end{verbatim}
 231
 232 Both the Kitten and Palacios clone commands should be run from the
 233 same directory. This means that both repositories should be located at
 234 the same directory level. The Kitten build process depends on this.
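
The layout requirement above can be checked before starting a build.
The following is a minimal sketch, not part of the Palacios or Kitten
tools: the function name \verb.check_layout. is hypothetical, and it
assumes the clones kept their default directory names, {\em palacios}
and {\em kitten}.

\begin{verbatim}
# Hypothetical helper: verify that both clones sit side by side.
# $1 is the directory expected to hold them (default: current directory).
check_layout() {
    base="${1:-.}"
    status=0
    for d in palacios kitten; do
        if [ ! -d "$base/$d" ]; then
            echo "missing: $base/$d" >&2
            status=1
        fi
    done
    return "$status"
}
\end{verbatim}

Running \verb.check_layout. from the directory in which you ran both
clone commands should print nothing and return zero.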
 235
 236 {\em Important:} Like Palacios, Kitten is under active development,
 237 and its source tree is frequently changing. In order to keep up to
 238 date with the latest version, it is necessary to periodically pull the
 239 latest changes from the mirror repository by running \verb.hg pull.
 240 followed by \verb.hg update..
 241
The current release of Kitten, which works correctly with the current 1.2 release of Palacios, is 1.2.0.
 243 To switch to the current release branch, execute
 244
 245 \begin{verbatim}
 246 hg checkout release-1.2.0
 247 \end{verbatim}
 248
 249
 250 \section{Compiling Palacios}
 251
 252 The Palacios build process has been changed recently from a homegrown
 253 environment to the widely used KBuild environment.  KBuild is also
 254 used for building Kitten.  Because KBuild is the build environment
 255 used for Linux, much of what you learn about configuring and building
 256 Linux kernels is readily applicable to Palacios and Kitten.
 257
 258 The output of the Palacios build process is a static library that
 259 includes the Palacios VMM and relevant guest support code blocks. This
 260 static library is then linked into a host operating system. Palacios
 261 internally supports GeekOS and can generate a complete OS image via a
 262 unified build process.  By complete OS image, we mean an ISO image
 263 containing GeekOS, Palacios, and a guest image (another ISO image) for
 264 testing.
 265
 266 These days, however, Palacios is typically embedded into Kitten, not
 267 GeekOS.  To combine Palacios with Kitten, it is necessary to first
 268 configure and build Palacios, then to configure and build Kitten,
 269 linking in Palacios.   Kitten can also be configured to link in a
 270 guest image for testing.
 271
 272 \subsection{Configuration}
 273
 274 To configure Palacios, enter the top level Palacios directory and
 275 execute:
 276 \begin{verbatim}
 277 make clean
 278 make xconfig
 279 \end{verbatim}
At this point, you will be presented with a KBuild configuration
 281 screen, similar to what you would see in configuring a Linux kernel.
 282 Palacios has far fewer options, however.   If you don't have X or
 283 don't want the graphical configuration system, you can also use the
 284 \verb.menuconfig. or \verb.config. targets.  The available options
 285 change over time, so we do not cover all of them here, but here are a
 286 few that are usually important.  We note how to set these options to
 287 configure a minimal VMM in the following.
 288 \begin{itemize}
 289 \item Target Configuration:
 290 \begin{itemize}
 291
 292 \item Red Storm (Cray XT3/XT4) --- turn on to target Cray XT4
 293 supercomputers.  (off)
 294 \item AMD SVM Support --- targets AMD processors with the SVM hardware
 295 virtualization features (on)
 296 \item Intel VMX support --- targets Intel processors with the VMX
 297 hardware virtualization features (on)
 298 \item Compile for a multi-threaded OS (on)
 299 \item Enable VMM telemetry support --- this is lightweight logging and
 300 data collection (on)
 301 \item Enable VMM instrumentation --- this is heavyweight logging and
 302 data collection (off)
 303 \item Enable passthrough video --- this lets a guest write directly to
 304 the video card (off)  (this is outdated and handled in a different way now)
\item Enable experimental options --- this makes it possible to select
features that are under current development (off).  You probably want
to leave this all off.  The VNET suboption is for an experimental VMM-embedded
overlay network under development by Lei Xia and Yuan Tang.
 309 \item Enable built-in versions of stdlib functions --- this adds
 310 needed stdlib functions that the host OS may not supply.  For use with
 311 Kitten turn on and enable strcasecmp() and atoi().
 312 \item Enable built-in versions of stdio functions (off)
 313 \end{itemize}
 314 \item Symbiotic Functions (these are experimental options for Jack
 315 Lange's thesis).
 316 \begin{itemize}
 317 \item Enable Symbiotic Functionality --- This adds symbiotic features
 318 to Palacios, specifically support for discovery and configuration by
 319 symbiotic guests, the SymSpy passive information interface for
 320 asynchronous symbiotic guest $\leftrightarrow$ symbiotic VMM
 321 information flow, and the SymCall functional interface for synchronous
 322 symbiotic VMM $\rightarrow$ symbiotic guest upcalls.  (off)
 323 \item Symbiotic Swap --- Enables the SwapBypass symbiotic service for
 324 symbiotic Linux guests.  (off)
 325 \end{itemize}
 326 \item Debug Configuration
 327 \begin{itemize}
 328 \item Compile with Debug info --- adds debug symbols (-g)  (off)
 329 \item Enable Debugging --- makes it possible to show PrintDebug output
 330 (on).   You can selectively turn on debugging output for each major
 331 VMM component, including shadow paging, nested paging, control
 332 registers, interrupts, I/O, instruction emulation and XED, halt, and
 333 the device manager.   Note that the more debugging output you turn on,
 334 the slower the VMM will go since it will have to wait for the prints
 335 to finish.
 336 \end{itemize}
 337 \item BIOS Selection --- Lets you select which code blobs will be
 338 used for bootstrapping the guest.   There are currently three:  a
 339 BIOS, a Video BIOS, and the VMXAssist V8086 service (the latter is used only on
 340 Intel VMX).   Generally, you should not need to change these.
 341 \item Virtual Devices --- virtual devices can be instantiated and
 342 added to a guest.   The following is a list of the currently
 343 implemented virtual devices.
 344 \begin{itemize}
 345 \item BOCHS Debug Console Device --- used for debugging output from
 346 the guest BIOS (on)
 347 \item OS Debug Console Device --- used for debugging output from
 348 the guest kernel (on)
 349 \item 8259A PIC - legacy Programmable Interrupt Controller chip --- used
 350 for bootstrap of most guests (on)
 351 \item APIC - in-processor Advanced Programmable Interrupt Controller --- used
 352 for interrupt delivery on almost all guests (on)
\item IOAPIC - Off-chip APIC  --- used for interrupt delivery for almost
all guests (on)
 355 \item PIT - legacy 8254 timer (on)
 356 \item i440fx Northbridge --- emulation of a typical PC North Bridge
 357 chip, used on almost all guests (on)
 358 \item PIIX3 Southbridge --- emulation of a typical PC South Bridge
 359 chip, used on almost all guests (on)
 360 \item PCI --- emulation of a PCI bus - needed for attaching most
 361 devices (on)
 362 \begin{itemize}
 363 \item Passthrough PCI --- allows us to make a hardware PCI device visible and
 364 directly accessible by the guest (off)
 365 \end{itemize}
 366 \item Generic --- this is a run-time configurable device that can intercept I/O port read/writes and memory region reads/writes.   Intercepted reads and writes can either be ignored or forwarded to actual hardware, and the data flow can optionally be printed.   This is a useful tool with at least three purposes.  First, it makes it possible to ``stub out'' hardware that isn't currently implemented and for which we don't want to allow passthrough access. Second, it makes it possible to provide low-level passthrough access to physical hardware.   Third, it can be used to spy on guest/device interactions, which is very helpful when trying to understand the interface of a device.
 367 \item NVRAM - motherboard configuration memory --- needed by BIOS bootstrap (on)
 368 \item Keyboard - Generic PS/2 keyboard, including mostly broken mouse
 369 implementation (on)
 370 \item IDE --- Support for virtual IDE controllers that support disks
 371 and CD ROMs (on)
 372
 373 \item NE2K - NE2000 and RTL8139 network devices (off)
\item CGA - CGA video card (partial implementation) (off)
 375 \begin{itemize}
 376 \item Telnet Virtual Console (off)   When CGA and Telnet Console are
 377 on, it is possible to telnet to the console of the guest.   Eventually
the rest of this will provide a simple bitmapped video console for VNC
 379 access.
 380 \end{itemize}
 381 \item RAMDISK storage backend --- used to create RAM disk
 382 implementations of block devices (on)
 383 \item NETDISK storage backend --- used to create network-attached disk
 384 implementations of block devices, e.g., network block devices (off)
 385 \item TMPDISK storage backend --- used to create temporary storage
 386 implementations of block devices (on)
 387 \item Linux Virtio Balloon Device --- used for memory ballooning by
 388 Linux virtio-compatible guests (off)
 389 \item Linux Virtio Block Device --- used for fast block device support
 390 by Linux virtio-compatible guests (off)
 391 \item Linux Virtio Network Device --- used for fast network device support
 392 by Linux virtio-compatible guests (off)
 393 \item Linux Virtio Symbiotic Device (off)
 394 \item Symbiotic Swap Disk (multiple versions) --- used for the
 395 SwapBypass service (off)
 396 \item Disk Performance Model --- used for the
 397 SwapBypass service (off)
 398 \end{itemize}
 399 \end{itemize}
 400
 401 \subsection{Compilation}
 402
 403 After configuring Palacios---remember to save your changes---you can
 404 compile it by executing
 405 \begin{verbatim}
 406 make
 407 \end{verbatim}
This will produce the file {\em libv3vee.a} in the current directory.
 409 This static library contains the Palacios VMM and is ready for
 410 embedding into an OS, such as Kitten.  The library provides the
 411 ability to instantiate and run virtual machines.  By default, on a 64
 412 bit machine, the library is compiled for 64 bit machines (x86\_64),
 413 while on a 32 bit machine, it is compiled for 32 bit machines.  You
 414 can override this using the ARCH=i386 or ARCH=x86\_64 arguments to the
 415 make, provided you have the relevant tools available.  The 64 bit
 416 version is what you need for use with Kitten.  A 64 bit Palacios can
 417 run both 64 and 32 bit guests.  Both {\em newskysaw} and {\em
 418 newbehemoth} are 64 bit machines.
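
As a sketch of the override described above, the helper below passes
one of the two \verb.ARCH. values named in this section to make.  The
function name \verb.build_palacios. is hypothetical; it assumes you
run it from the top-level Palacios directory and that the tools for
the requested architecture are installed.

\begin{verbatim}
# Hypothetical helper: build Palacios for an explicit architecture.
# $1 must be i386 or x86_64, the two values this manual names.
build_palacios() {
    case "$1" in
        i386|x86_64)
            make ARCH="$1"
            ;;
        *)
            echo "usage: build_palacios i386|x86_64" >&2
            return 1
            ;;
    esac
}
\end{verbatim}

For example, \verb.build_palacios i386. requests a 32 bit library,
while any other argument is rejected before make is run.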
 419
 420 The compilation process will also create the utility {\em build\_vm}
 421 (which builds guest images from XML description files), and a very
 422 simple guest image called {\em guest\_os.img} that essentially contains
 423 a Linux kernel and BusyBox.  The default Kitten configuration will use
 424 this guest image.
 425
 426
 427 \section{Compiling Kitten}
 428 Kitten requires a 64-bit version of Palacios, so make sure that
 429 Palacios has been correctly compiled before compiling Kitten.  The
 430 current default for Palacios is 64 bit.
 431
 432 \subsection{Configuration}
 433 Kitten borrows a lot of concepts from Linux, including the Linux build
 434 process. As such it must be configured before it is actually compiled.
 435 The Kitten configuration process is the same as Linux, and can be
 436 accessed via any of these make targets.
 437 \begin{itemize}
 438 \item \verb.make xconfig.
 439 \item \verb.make config.
 440 \item \verb.make menuconfig.
 441 \end{itemize}
 442
Of course, there is a range of configuration options.  In the
 444 following, we note only the most important.  The indicated values are
 445 defaults for the simplest interaction between Kitten and Palacios.
 446 \begin{itemize}
 447 \item Target Configuration
 448 \begin{itemize}
 449 \item System Architecture --- you probably want to set this to
 450 PC-Compatible, unless you are working on Red Storm.
 451 \item Processor Family --- you want to set this to either
 452 AMD-Opteron/Athlon64 or Intel-64/Core2, depending on whether you have
 453 a 64 bit AMD or 64 bit Intel processor.
 454 \end{itemize}
 455 \item Virtualization
 456 \begin{itemize}
 457 \item Include Palacios VMM --- this will link against the Palacios
 458 library (on)
 459 \item Path to pre-built Palacios tree --- directory where libv3vee.a
 460 can be found.
 461 \item Path to guest image --- location of the test guest OS
 462 image that will be embedded.  We will say more about this later.
 463 Essentially, however, a guest image consists of a blob that begins
 464 with an XML description of the desired guest environment and the
 465 contents of the remainder of the blob.  The remainder of the blob
 466 usually contains disk or cd images.  The default path is
 467 ../palacios/guest\_os.img, where it will find the simple guest created
 468 during the Palacios build process.
 469 \end{itemize}
 470 \item Networking
 471 \begin{itemize}
 472 \item Enable LWIP TCP/IP stack.  This activates a simple TCP/IP stack
 473 that things like NETDISK can use. (off)
 474 \end{itemize}
 475 \item Device Drivers
 476 \begin{itemize}
 477 \item VGA Console --- driver for basic video. (on)
 478 \item Serial Console --- driver for serial port console.  (on)
 479 \item VM Console --- driver for Kitten console on top of Palacios.  If
 480 Kitten is run {\em as a guest}, and it has VM Console on, then it can output
 481 cleanly via the Palacios OS Console device (on).
 482 \item NE2K Device Driver --- driver for NE2K and RTL8139 network cards
 483 (off)
 484 \item VM Network Driver --- driver for Kitten network output using
 485 Palacios.  If Kitten is run {\em as a guest}, and it has VM Network
 486 Driver, then it can send and receive packets using the Palacios Linux
 487 virtio network device.  (off)
 488 \end{itemize}
 489 \item ISOIMAGE configuration:  Kitten kernel arguments.   Note that
 490 this is NOT for the guest image, but rather for the Kitten image.  You
 491 can leave this alone.  For Palacios operation, it's important that the
 492 option \verb.console=serial. appears.  If the NE2K/RTL8139 driver
 493 should be used \verb.net=rtl8139. should appear.
 494 \item Kernel Hacking
 495 \begin{itemize}
 496 \item Kernel Debugging --- here you can turn on various Kitten
 497 Linux-like debugging features.   Only a few are noted below:
 498 \begin{itemize}
 499 \item Compile the kernel with debug info --- if this is on, you will
 500 have debugging information compiled in (-g)
 501 \item KGDB --- if you have this enabled, you will be able to attach to
 502 the running kernel from the GDB debugger.  This means you can also
 503 attach to Palacios, which is embedded.   If you want to debug Palacios
 504 using KGDB, be sure to turn on debugging in Palacios as well.
 505 \end{itemize}
 506 \end{itemize}
\item Include Linux compatibility layer --- if this is on, you can
 508 selectively add Linux system calls and other functionality to Kitten.
 509 Kitten is able to run Linux ELF executables as user processes with
 510 this layer. (off)
 511 \end{itemize}
 512
 513 The guest OS that is to be booted as a VM is included as a blob
 514 pointed to by ``Path to guest image''.   The blob starts with an XML
 515 description of the guest, followed by other chunks of data used, for
 516 example, as the content of virtual hard drives or CD ROMs.  Please see
 517 Section~\ref{sec:guestconfig} for basic information on how to use the
 518 guest builder to assemble a guest OS blob.
 519
 520 By default, the init task that is executed after Kitten boots (located
 521 in user/hello\_world) does a number of Kitten tests.  One of these is
 522 a test of the VMM API, which is implemented using Palacios.  When this
 523 test is done, a VM is created, configured according to the XML, and
 524 the guest OS blob is launched in it.
 525
 526
 527 \subsection{Compilation}
 528
 529 After Kitten has been configured it can be compiled.  This is done
 530 simply by executing
 531 \begin{verbatim}
 532 make isoimage
 533 \end{verbatim}
 534 This command will compile Kitten (with Palacios embedded in it) and
 535 the init task (which will contain the guest OS blob), and then
 536 assemble an ISO image file which can be used to boot a machine.  The ISO
 537 image is located at {\em ./arch/x86\_64/boot/image.iso}.
 538
 539 This image file can be used for booting a QEMU emulation environment,
 540 for booting a remote machine using PXE, or can be burned to CD/DVD for
 541 booting a machine physically.
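
Putting the two builds together, a rebuild after a source change might
look like the following sketch.  It assumes both trees have already
been configured as described above and sit side by side under their
default names; the function name \verb.rebuild_all. is hypothetical.

\begin{verbatim}
# Hypothetical helper: rebuild Palacios, then Kitten.
# $1 is the directory holding both clones (default: current directory).
rebuild_all() {
    base="${1:-.}"
    # Build the Palacios static library (libv3vee.a) first ...
    (cd "$base/palacios" && make) || return 1
    # ... then build Kitten, which links it in and assembles the ISO.
    (cd "$base/kitten" && make isoimage) || return 1
    # Report where the bootable image should now be.
    echo "$base/kitten/arch/x86_64/boot/image.iso"
}
\end{verbatim}

The order matters because Kitten links against the Palacios library,
so the helper stops as soon as either build fails.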
 542
 543
 544 \section{Basic Guest Configuration}
 545 \label{sec:guestconfig}
 546
 547 A simple guest is created when you build Palacios.  To configure your
 548 own guest, you write an XML configuration file, which contains
 549 references to other files that contain data needed to instantiate
 550 stateful devices such as virtual hard drives and CD ROMs.  You supply
 551 this information to a guest builder utility that assembles a guest
 552 image suitable for reference in the Kitten configuration, as described
 553 above.
 554
 555 The guest builder utility is located in {\em
 556 palacios/utils/guest\_creator}.  You will need to run \verb.make. in
 557 that directory to compile it, resulting in the executable named {\em
build\_vm}\footnote{This executable will also be copied into the top-level
Palacios directory.}.  Also located in that directory is an example
 560 configuration file, named {\em default.xml}.  We typically use this
 561 file as a template.  It is carefully commented.  In summary, a
 562 configuration consists of
 563 \begin{itemize}
 564 \item Physical memory size of the guest
 565 \item Basic VMM settings, such as what form of virtual paging is to be
 566 used, the scheduler rate, whether services like telemetry are on, etc.
 567 \item A memory map that maps regions of the host physical address
 568 space to the guest physical address space.  This can, for example,
 569 make a framebuffer visible in the guest.
 570 \item A list of the files that will be used in assembling the image.
 571 For example, the contents of a boot CD.
 572 \item A list of the devices that the guest will have, including
 573 configuration data for each device.
 574 \end{itemize}
 575 There are a few subtleties involved with devices.  One is that some
 576 devices form attachment points for other devices.  The PCI device is
 577 an example of this.  Another is that each device needs to define how
it is attached (e.g., direct (implicit) or via a PCI bus).
 579 Finally, there may be multiple instances of devices.   For example, a
 580 PCI passthrough device is instantiated for every underlying PCI device
 581 we want to make visible in the guest.
 582
 583 The XML configuration format is carefully designed to be extensible.
 584 For example, new devices could use additional or new configuration
 585 options.  The configuration parser in Palacios essentially ignores XML
 586 blocks it doesn't understand.
 587
 588
 589 To build a guest, one runs
 590 \begin{verbatim}
 591 palacios/utils/guest_creator/build_vm myconfig.xml -o myimage.dat
 592 \end{verbatim}
 593 Here, {\em myimage.dat} is the guest image that can be given to
 594 Kitten.
 595
 596 A common kind of guest used for testing is one that boots some form of
bootable Linux distribution, or another live OS distribution.  These
distributions are CD ROM images (ISOs).  A range of them are available
on {\em newskysaw} under {\em /opt/vmm-tools/isos}.  We often use
Puppy Linux ({\em puppy.iso}) or Finnix ({\em finnix.iso}), for
example, but ISOs are also available for Windows of different flavors,
 602 DOS, GeekOS, and others.  If you just want to use some guest ISO image
 603 like this, you can generally just copy the default XML file, and
 604 modify the
 605 \verb.filename=. attribute here:
 606 \begin{verbatim}
 607  <files>
 608     <!-- The file 'id' is used as a reference for
 609          other configuration components -->
 610     <file id="boot-cd" filename="/home/jarusl/image.iso" />
 611     <!--<file id="harddisk" filename="firefox.img" />-->
 612  </files>
 613 \end{verbatim}
 614
 615 For careful, repeatable experimentation, it is often convenient to
 616 build your own simplified Linux guest image.  It will boot {\em much}
 617 faster than a full blown distribution and you can readily set up an
 618 environment in which you can exert very tight control, being able to
 619 modify the Linux kernel, the included files (e.g., benchmarks), and
 620 other components very rapidly.  To learn more about how to do this,
 621 please consult the separate document named {\em Building a bootable
 622 guest image for Palacios and Kitten}.
 623
 624 \section{Running Palacios/Kitten}
 625 Kitten and Palacios are capable of running under QEMU, which makes
 626 debugging much simpler.  QEMU is a user-level Linux or Windows program
 627 that emulates a PC machine.
 628
 629 The basic form of the command to start the QEMU emulator is:
 630 \begin{verbatim}
 631 /usr/local/qemu/bin/qemu-system-x86_64 -smp 1 -m 1024 \
 632         -serial file:./serial.out \
 633         -cdrom ./arch/x86_64/boot/image.iso  \
 634         < /dev/null
 635 \end{verbatim}
 636
 637 The command starts up a single processor emulated machine, with 1GB of
 638 RAM and a CD-ROM drive loaded with the Kitten ISO image.  All output
 639 to the serial port is written directly to a file called {\em
 640   serial.out}. This command can be copied into a shell script for easy
 641 access.
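
One possible form of such a script is sketched below.  The function
name \verb.run_qemu. and the \verb.QEMU. environment variable are
hypothetical conveniences; the emulator path and options are the ones
shown above.

\begin{verbatim}
# Hypothetical wrapper around the QEMU command above.
# $1 is the ISO to boot (default: the Kitten build output);
# set QEMU to use an emulator binary installed elsewhere.
run_qemu() {
    qemu="${QEMU:-/usr/local/qemu/bin/qemu-system-x86_64}"
    iso="${1:-./arch/x86_64/boot/image.iso}"
    "$qemu" -smp 1 -m 1024 \
        -serial file:./serial.out \
        -cdrom "$iso" \
        < /dev/null
}
\end{verbatim}

Run from the top-level Kitten directory, \verb.run_qemu. boots the
freshly built image; the serial output can then be followed from
another terminal with \verb!tail -f serial.out!.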
 642
 643 We can also run Palacios/Kitten on physical hardware.  The slow way is
 644 to burn the Kitten ISO image onto a CD ROM and then boot the test
 645 machine with it.  The much faster way is to set the test machine up to
 646 use the PXE network boot system (most modern BIOSes support this), and
 647 boot your Kitten image over the network.  The debugging output will
 648 then appear on the actual serial port of the physical machine.  The
 649 separate document {\em Booting Palacios/Kitten Over the Network Using
 650 PXE} explains how to set up PXE boot and serial.  For the Northwestern
 651 environment, please talk to Jack, Peter, Lei, or Yuan if you need to
 652 be able to do this.  Northwestern has a range of AMD and Intel boxes
 653 for testing, as do UNM and Sandia.  A different form of network boot
 654 is used for Red Storm.
 655
 656
 657 \section{Development Guidelines}
 658
 659 There are standard requirements we have for code entering the mainline.
 660
 661 First and foremost, Palacios is designed to be OS independent and
 662 support 32-bit and 64-bit architectures. This means that developers should
 663 not include any external OS specific dependencies in any Palacios
 664 component. Also all changes need to be tested on both 32-bit and 64-bit
 665 architectures to make sure that they compile as well as run correctly.
 666
 667 \paragraph*{Coding Style}
 668
 669 ``The use of equal negative space, as a balance to positive space, in a
 670 composition is considered by many as good design. This basic and often
overlooked principle of design gives the eye a `place to rest,'
 672 increasing the appeal of a composition through subtle means.''
 673 \newline\newline
 674 Translation: Use the space bar, newlines, and parentheses.
 675
 676 Curly-brackets are not optional, even for single line conditionals.
 677
 678 Tabs should be 4 characters in width.
 679
 680 {\em Special:} If you are using XEmacs add the following to your \verb!init.el! file:
 681 \begin{verbatim}
 682 (setq c-basic-offset 4)
 683 (c-set-offset 'case-label 4)
 684 \end{verbatim}
 685
 686 {\em Bad}
 687 \begin{verbatim}
 688 if(a&&b==5||c!=0) return;
 689 \end{verbatim}
 690
 691
 692 {\em Good}
 693 \begin{verbatim}
 694 if (((a) && (b == 5)) ||
 695     (c != 0)) {
 696         return;
 697 }
 698 \end{verbatim}
 699
 700
 701
\paragraph*{Fail Stop}
Because booting a basic Linux kernel results in over 1 million VM exits,
catching silent errors is next to impossible. For this reason,
ANY time your code encounters an error it should return -1 and expect
execution to halt.

This includes unimplemented features and unhandled cases. These cases
should ALWAYS return -1.

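
As an illustration of the fail-stop convention, the following C sketch
shows a handler that returns -1 for an unhandled case and a caller that
propagates the failure instead of continuing.  The names
\verb!handle_exit!, \verb!run_guest_step!, and the exit reasons are
hypothetical and are not part of the Palacios API.

```c
#include <stdio.h>

/* Hypothetical exit reasons, for illustration only. */
enum exit_reason { EXIT_IO = 0, EXIT_HLT = 1, EXIT_UNKNOWN = 2 };

/* Returns 0 on success and -1 on any error or unhandled case. */
static int handle_exit(enum exit_reason reason) {
    switch (reason) {
        case EXIT_IO:
            return 0;
        case EXIT_HLT:
            return 0;
        default:
            /* Unimplemented or unexpected: fail instead of guessing. */
            fprintf(stderr, "Unhandled exit reason %d\n", (int)reason);
            return -1;
    }
}

/* Callers check every return value and pass the failure upward. */
static int run_guest_step(enum exit_reason reason) {
    if (handle_exit(reason) == -1) {
        return -1;
    }

    return 0;
}
```

The point of the sketch is that no layer swallows the error: the -1
travels up the call chain until execution stops.
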
\paragraph*{Function names}
Externally visible functions should be used rarely and should have
unique names. Currently we have several techniques for achieving this:

\begin{enumerate}
\item \verb.#ifdefs. in the header file
\newline
When the V3 Hypervisor is compiled, it defines the symbol
\verb!__V3VEE__!. Any function that is not needed outside the Hypervisor
context should be inside an \verb.#ifdef __V3VEE__. block; this makes it
invisible to the host environment.

\item Static Functions
\newline
Any utility functions that are only needed in the .c file where they
are defined should be declared as static and not included in the
header file. You should make an effort to use static functions
whenever possible.

\item \verb.v3_. prefix \newline Major interface functions should be
  named with the prefix \verb!v3_!. This allows easy understanding of
  how to interact with the subsystems.  In the case that they need to
  be externally visible to the host OS, choose names that are unlikely
  to collide with other functions.
\end{enumerate}

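
The three techniques can be combined as in the following C sketch,
which shows a header fragment and its \verb!.c! file as a single
listing.  The function names \verb!v3_example_count_set_bits! and
\verb!is_bit_set! are made up for illustration, and \verb!__V3VEE__! is
defined by hand here only so that the listing compiles on its own; in a
real build the symbol is defined when the hypervisor is compiled.

```c
/* Defined by hand so the sketch compiles alone (assumption for this example). */
#ifndef __V3VEE__
#define __V3VEE__
#endif

/* ---- would live in a header file (hypothetical) ---- */
#ifdef __V3VEE__
/* Declared only inside the hypervisor; hidden from the host environment. */
int v3_example_count_set_bits(unsigned int value);
#endif

/* ---- would live in the matching .c file ---- */

/* File-local helper: static, and not declared in the header. */
static int is_bit_set(unsigned int value, unsigned int bit) {
    return (int)((value >> bit) & 0x1u);
}

/* Interface function: carries the v3_ prefix. */
int v3_example_count_set_bits(unsigned int value) {
    int count = 0;
    unsigned int bit = 0;

    for (bit = 0; bit < (sizeof(value) * 8); bit++) {
        if (is_bit_set(value, bit)) {
            count++;
        }
    }

    return count;
}
```
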
\paragraph*{Debugging Output}
Debugging output is sent through the host OS via functions in the
\verb.os_hooks. structure. These functions have various wrappers of the form
\verb.Print*., with \texttt{printf}-style semantics.

Two functions of note are \verb.PrintDebug. and \verb.PrintError..

\begin{itemize}

\item PrintDebug
\newline
Should be used for debugging output that will often be turned off
selectively by the VMM configuration.

\item PrintError
\newline
Should be used when an error occurs. This output will never be
optimized out and will always print.
\end{itemize}

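
The following C sketch shows one way such wrappers might be structured
and used.  The hook table, \verb!example_print!, and the
\verb!EXAMPLE_DEBUG! switch are assumptions made for illustration; the
actual \verb!os_hooks! layout and the real \verb!PrintDebug! and
\verb!PrintError! definitions in Palacios may differ.

```c
#include <stdarg.h>
#include <stdio.h>

/* Stand-in for the host-provided hook table (hypothetical layout). */
struct example_os_hooks {
    int (*print)(const char * fmt, va_list args);
};

/* Stand-in host output routine: writes to stderr. */
static int host_print(const char * fmt, va_list args) {
    return vfprintf(stderr, fmt, args);
}

static struct example_os_hooks example_hooks = { host_print };

/* Forwards a printf-style message to the host; returns -1 if no hook is set. */
static int example_print(const char * fmt, ...) {
    va_list args;
    int ret = -1;

    if (example_hooks.print != NULL) {
        va_start(args, fmt);
        ret = example_hooks.print(fmt, args);
        va_end(args);
    }

    return ret;
}

/* Error output is always compiled in. */
#define PrintError(...) example_print(__VA_ARGS__)

/* Debug output compiles to nothing unless EXAMPLE_DEBUG is defined. */
#ifdef EXAMPLE_DEBUG
#define PrintDebug(...) example_print(__VA_ARGS__)
#else
#define PrintDebug(...) do { } while (0)
#endif

/* Example use, combined with the fail-stop rule. */
int example_set_mode(int mode) {
    PrintDebug("Setting mode %d\n", mode);

    if ((mode < 0) || (mode > 2)) {
        PrintError("Invalid mode %d\n", mode);
        return -1;
    }

    return 0;
}
```
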
\section{Code Submission}
\label{sec:submission}

To commit changes to the central repository, they need to be exported
as a patch set that can be applied directly to the mainline. Both Git
and Mercurial contain functionality to allow developers to maintain
changes as a patch set. There are also a few options that make dealing
with patches easier.

\subsection{Palacios}

Git includes support for directly exporting local repository commits
as a patch set. The basic operation is for a developer to commit a
change to a local repository, and then export that change as a patch
that can be applied to another Git repository.  Patch generation is
done with {\em git format-patch}.  While this approach works, it has
a number of issues. The main problem is that it is
difficult to fully encapsulate a new feature in a single commit, and
dealing with multiple patches that often overwrite each other is not a
viable option either. Furthermore, once a patch is applied to the
mainline, it will generate a conflicting commit that will appear
when the developer next pulls from the central
repository. This can result in the two repositories getting out of
sync. It is possible to deal with this by manually rebasing the local
repository, but doing so is difficult and error-prone.

This approach also does not work well when patches are being revised. A
normal patch will go through multiple revisions as it is reviewed and
modified by others. This often leads to synchronization issues as well
as errors with patch revisions. Ultimately it is the responsibility of
the developer to generate a patch that will apply cleanly to the
mainline.

For this reason most internal developers should seriously consider
{\em Stacked Git}. Stacked Git is designed to make patch development
easier and less of a headache. The basic mode of operation is for a
developer to initialize a patch for a new feature and then
continuously apply changes to the patch. Stacked Git allows a
developer to layer a series of patches on top of a local Git
repository, without causing the repository to fall out of sync due to
local commits. Basically, the developer never commits changes to the
repository itself but instead commits the changes to a specific
patch. The local patches are managed using stack operations (push/pop),
which allow a developer to apply and unapply patches as
needed. Stacked Git also manages new changes to the underlying Git
repository as a result of a pull operation and prevents collisions as
changes are propagated upstream. For instance, if you have a local
patch that is applied to the mainline as a commit, when the commit is
pulled down the patch becomes empty because it is effectively
identical to the mainline. It also makes incorporating external
revisions to a patch easier. Stacked Git is installed on {\em
newskysaw} and {\em newbehemoth} in \verb!/opt/vmm-tools/bin/!.

Brief command overview:
\begin{itemize}
\item \verb.stg init. -- initializes Stacked Git in a given branch.
\item \verb.stg new. -- creates a new patch; an editor will open
asking for a commit message that will be used when the patch is
ultimately committed.
\item \verb.stg pop. -- pops a patch off of the source tree.
\item \verb.stg push. -- pushes a patch back onto the source tree.
\item \verb.stg export. -- exports a patch to a directory as a file
that can then be emailed.
\item \verb.stg refresh. -- commits local changes to the patch at
the top of the applied stack.
\item \verb.stg fold. -- applies a patch file to the current
patch. (This is how you can manage revisions that are made by other developers.)
\end{itemize}

You should definitely look at the online documentation to better
understand how Stacked Git works. Using it is not required, of course,
but if you want your changes to be applied it's up to you to generate a
patch that is acceptable to a core developer. Ultimately, using Stacked Git
should be easier than going it alone.

All patches should be emailed to Jack for inclusion in the
mainline. An overview of the organization is given in
Figure~\ref{fig:process}. You should assume that the first revision of
a patch will not be accepted, and that you will have to make
changes. Furthermore, the final form of the patch most likely will not
be exactly what you submitted.

\begin{figure}[t]
\begin{center}
\includegraphics[height=3.5in]{dev_chart.pdf}
\end{center}
\caption{Development organization}
\label{fig:process}
\end{figure}

\subsection{Kitten}

Writing code for Kitten follows essentially the same process as
Palacios. The difference is that the patches need to be emailed to the
Kitten developers. To send in a patch, you can just email it to the
V3Vee development list.

Also, instead of Stacked Git you should use Mercurial patch
queues. This feature is enabled by adding the following to your
\verb!.hgrc! file:
\begin{verbatim}
[extensions]
hgext.mq=
\end{verbatim}

Mercurial queues use the same stack operations as Stacked Git; however,
they do not automatically handle synchronization with pull
operations. Before you update from the central version of Kitten, you
need to pop all of the patches, and then push them again once the
update is complete.

Basically:
\begin{verbatim}
hg qpop -a
hg pull
hg update
hg qpush -a
\end{verbatim}

\section{Networking}

Both the Kitten and GeekOS substrates on which Palacios can run
currently include drivers for two simple network cards, the NE2000
and the RTL8139.  Palacios also supports passthrough I/O for PCI
devices, meaning we can make NICs directly accessible by guests. The
Kitten substrate is acquiring an ever-increasing set of drivers for
specialized network systems.  A lightweight networking stack is
included so that TCP/IP networking is possible from within the host OS
kernel and in Palacios.

When debugging Palacios on QEMU, it is very convenient to add an
RTL8139 card to your QEMU configuration, and then drive it from within
Palacios.  QEMU can be configured to provide network connectivity to the
emulated machine, including bridging the emulated machine with a
physical network.  Local connectivity can be done with redirection, or
with a TAP interface.  For global connectivity, a TAP interface must
be used; it is bridged to a physical interface.

\section{Configuring the development host's QEMU network}

To get local connectivity with redirection, no networking changes on
the host are needed.  However, people usually want to use TAP-based
networking, which does require changes.  One reason for this preference
is that TAP interfaces can be inspected with tools like Wireshark,
which makes for much easier debugging of network code.

In order to get TAP-based QEMU networking to function, it is necessary
to create TAP interfaces, and, optionally, to bridge them to real
networks.  A development machine typically will have several TAP
interfaces, and more can be created.  Generally, each developer should
have a TAP interface of his or her own.  Here we use newskysaw as an
example.

To set up a TAP interface on newskysaw, the following command is used:
\begin{verbatim}
/root/util/tap_create tapX
\end{verbatim}

When QEMU runs with a TAP interface, it will use /etc/qemu-ifup to
bring up the interface.  /etc/qemu-ifup looks like this:

\begin{verbatim}
#!/bin/bash
echo "Executing /etc/qemu-ifup - no external bridging"
echo "Bringing up $1 for bridged mode..."
NET=`echo $1 | cut -dp -f2`
sudo /sbin/ifconfig $1 172.2${NET}.0.1 up
sleep 2
\end{verbatim}

The interface tap$N$ is brought up with the IP address 172.2$N$.0.1.
ifconfig will also create a routing rule that sends traffic for the
172.2$N$.0.0/16 network to tap$N$.  The upshot is that if the code
running in QEMU uses an IP address in this network (for example:
172.2$N$.0.2), you will be able to talk to it from newskysaw.  For
example, from newskysaw, if you ping 172.21.0.2, the packet (and the
ARP request) will go out via tap1.  The source address will appear to
be 172.21.0.1.  The QEMU machine will see these packets on its
interface, and the software controlling its interface can respond to
172.21.0.1.

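
The address derivation performed by the script can be sketched in C as
follows.  The function \verb!example_tap_address! is hypothetical and
exists only to make the naming scheme concrete; it mirrors the
\verb!cut! and \verb!ifconfig! lines of the script for interface names
of the form tap$N$.

```c
#include <stdio.h>
#include <string.h>

/*
 * Writes "172.2<N>.0.1" into buf for an interface named "tap<N>".
 * Returns 0 on success and -1 on bad input or a too-small buffer.
 * Hypothetical helper, for illustration only.
 */
static int example_tap_address(const char * ifname, char * buf, size_t len) {
    const char * suffix = NULL;
    int written = 0;

    if ((ifname == NULL) || (buf == NULL)) {
        return -1;
    }

    if ((strncmp(ifname, "tap", 3) != 0) || (ifname[3] == '\0')) {
        return -1;
    }

    /* Everything after "tap" is the interface number N. */
    suffix = ifname + 3;
    written = snprintf(buf, len, "172.2%s.0.1", suffix);

    if ((written < 0) || ((size_t)written >= len)) {
        return -1;
    }

    return 0;
}
```

For tap1 this produces 172.21.0.1, matching the example above.  Note
that the scheme only yields a valid address while the number formed by
the digit 2 followed by $N$ does not exceed 255.
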
This form of networking is local to the machine.  You can also bridge
a TAP interface with a physical interface.  The result is that
a packet sent on the TAP interface will also be sent on the physical
interface.  Doing this requires more effort (and is not set up by
default on newskysaw).
As an example, consider that on newskysaw, the physical interface eth1
is connected to a private network switch to which the lab test
computers (v-test-amd, v-test-amd2, etc.) are connected.  To bridge,
for example, tap10 to this interface, you would do the following
(with root's help):
\begin{enumerate}
\item Bring up eth1 (ifconfig eth1 up {\em address}
netmask {\em mask}).  It is important that the address and mask you
choose are appropriate for the network eth1 is connected to.
\item Bring up tap10 without an address:
\verb!/sbin/ifconfig tap10 up!
\item Bridge tap10 and eth1:
\verb!/usr/sbin/brctl addif br0 tap10; /usr/sbin/brctl addif br0 eth1!.
This assumes that br0 was previously created.
\end{enumerate}

Bridging tap$N$ with eth1 will only work (where ``work'' means sending
a packet on the network and making the packet visible on localhost) if
the IP address in the code running in QEMU is set correctly.  This
means that it needs to be set to correspond to the network of eth1.
For the newskysaw configuration, this is a 10-net address.

\subsection{Configuring Kitten}

Kitten needs to be explicitly configured to use networking. Currently
only a subset of the networking configurations is supported. To
enable an Ethernet network you should enable the following options:

\begin{itemize}
\item Enable TCP Support
\item Enable UDP Support
\item Enable socket API
\item Enable ARP support
\end{itemize}

The other options are not supported, and enabling them will probably
break the kernel compilation.

To allow Kitten to communicate with the QEMU network card you also
need to enable the appropriate device driver: \newline
\verb.NE2K Device Driver (rtl8139).

The driver then needs to be listed as a Kernel Command Line argument
in the {\em ISOIMAGE configuration}. To do this, add
\verb.net=rtl8139. to the end of the argument string.

Kitten currently does not support the dynamic assignment of IP
addresses at runtime. Because of this, it is necessary to hardcode the
IP address into the device driver. For the rtl8139 network driver, look
in the file {\em drivers/net/ne2k/rtl8139.c} for the function
\verb.rtl8139_init..

There should be a block of code that looks like the following:
\begin{verbatim}
  struct ip_addr ipaddr = { htonl(0 | 10 << 24 | 0 << 16 | 2 << 8 | 16 << 0) };
  struct ip_addr netmask = { htonl(0xffffff00) };
  struct ip_addr gw = { htonl(0 | 10 << 24 | 0 << 16 | 2 << 8 | 2 << 0) };
\end{verbatim}

This sets the IP address to 10.0.2.16, the netmask to 255.255.255.0, and
the gateway address to 10.0.2.2. Change these assignments to match your
configuration.

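
To see how the shift expressions in that block encode an address, the
following C sketch packs four octets the same way and builds a netmask
from a prefix length.  The helper names \verb!example_pack_ipv4! and
\verb!example_netmask! are hypothetical and are not part of Kitten; the
driver additionally passes each value through \verb!htonl! to obtain
network byte order.

```c
#include <stdint.h>

/* Packs a.b.c.d into a host-order 32-bit value, most significant octet first. */
static uint32_t example_pack_ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d) {
    return ((uint32_t)a << 24) |
           ((uint32_t)b << 16) |
           ((uint32_t)c << 8)  |
           ((uint32_t)d);
}

/* Builds a host-order netmask with prefix_len leading one bits (0 to 32). */
static uint32_t example_netmask(unsigned int prefix_len) {
    if (prefix_len == 0) {
        return 0;
    }

    if (prefix_len >= 32) {
        return 0xffffffffu;
    }

    return (uint32_t)(0xffffffffu << (32 - prefix_len));
}
```

For instance, \verb!example_pack_ipv4(10, 0, 2, 16)! yields
\verb!0x0a000210!, the same value as the \verb!ipaddr! expression
before \verb!htonl! is applied, and \verb!example_netmask(24)! yields
\verb!0xffffff00!.
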
\paragraph*{Kitten as the Guest OS}

When running Kitten as a VM, the above applies except that you will
want to enable the {\em VMNET} device driver instead of the {\em rtl8139}.

\subsection{Running with networking}

\paragraph*{TAP Interface}
Running with a TAP interface provides either local or global
connectivity (depending on how the TAP interface is configured and/or
bridged).  From the perspective of the QEMU command line, both look
the same, however.  You simply add something like this to the command
line:
\begin{verbatim}
-net tap,ifname=tap2 -net nic,model=rtl8139
\end{verbatim}
The first \verb.-net. option indicates that you want to use a tap
interface, specifically \verb.tap2..   The second \verb.-net. option
specifies that this interface will appear to code in the QEMU machine
to be a network interface card of the specific model RTL8139.  Note
that this is a model for which we have a driver.  If tap2 were
bridged, we'd get global connectivity.  If not, we would just get
local connectivity.

\paragraph*{Redirection}
It is also possible to achieve limited local connectivity even if you
have no TAP support on your development machine.  In redirection, QEMU
essentially acts as a proxy, translating between TCP or other
connections on the host and low-level packet operations on the network
interface in the QEMU machine.  For example, the following options will
redirect the host's port 9555 to the QEMU machine's port 80:
\begin{verbatim}
-net user -net nic,model=rtl8139  -redir tcp:9555:10.10.10.33:80
\end{verbatim}
The first \verb.-net. option indicates that we are using user-level
networking (proxying).  The second \verb.-net. option indicates that
this user-level network will appear in the QEMU machine as an RTL8139
network card.   The \verb.-redir. option indicates that connections on
localhost:9555 will be translated into equivalent packet exchanges on
the RTL8139 card in the QEMU machine.  However, we have to tell QEMU
which IP address and port to use on the QEMU machine's side.  This is
what the 10.10.10.33 address and port 80 are.  In the example, if you
access port 9555 on localhost, say with:
\begin{verbatim}
telnet localhost 9555
\end{verbatim}
the packets that appear in the QEMU machine will be bound for
10.10.10.33, port 80.  Within the QEMU machine, your RTL8139 interface
must therefore be up on that address.

QEMU has many options to build up virtual or real networking. See
http://www.h7.dion.ne.jp/$\sim$qemu-win/HowToNetwork-en.html for more
information.

For further questions, talk to Jack, Lei, or Peter.

\end{document}