Edits of user manual and guest builder manual

diff --git a/manual/manual.tex b/manual/manual.tex

index d779462..91591ae 100755 (executable)
--- a/manual/manual.tex
+++ b/manual/manual.tex
@@ -22,6 +22,8 @@
 \setlength\parindent{0in}
 \setlength\parskip{0.1in} 
 
+\newcommand{\note}[1]{{$\rightarrow$ \bf Note: \emph{#1}}}
+
 \begin{document}
 
 \title{
@@ -35,13 +37,24 @@ Palacios Internal Developer Manual
 \maketitle
 
 
-This manual is written for Internal Palacios developers. It contains
-information on how to obtain the palacios code base, how to go about
+This manual is written for internal Palacios developers. It contains
+information on how to obtain the Palacios code base, how to go about
 the development process, and how to commit those changes to the
 mainline source tree.  This assumes that the reader has read {\em An
 Introduction to the Palacios Virtual Machine Monitor -- Release 1.0}
-and also has a slight working knowledge of {\em git}.
+and also has a slight working knowledge of {\em git}.  You will also
+want to read the document {\em Building a bootable guest image for
+Palacios and Kitten} in order to understand how to build an extremely
+lightweight guest environment, suitable for testing.
+
+Please note that Palacios and Kitten are under rapid development,
+hence this manual may very well be out of date!
 
+\newpage
+\tableofcontents
+\newpage
+\listoffigures
+\newpage
 
 \section{Overview}
 
@@ -51,9 +64,9 @@ uses both the centralized repository and distributed development
 models. A central repository exists that holds the master version of
 the code base. This central repository is cloned by multiple people
 and in multiple places to support various development efforts. A
-feature of git is that every developer actually has a fully copy of
+feature of \texttt{git} is that every developer actually has a full copy of
 the entire repository, and so can function independently until such
-time as they need to resync with the master version. 
+time as they need to re-sync with the master version. 
 
 There are typically multiple levels of access to the central
 repository, that are granted based on the type of developer being
@@ -61,19 +74,24 @@ granted access. The three basic developer types and their access
 privileges are:
 
 \begin{itemize}
-\item Core Developers: These are the lead developers and are in
+\item Core developers: These are the lead developers and are in
 charge of managing the master repository. They have full read/write
 access permissions to the central repository.
 
-\item Internal Developers: Formal members of the development
+\item Internal developers: Formal members of the development
 team. These people are capable of pulling directly from the central
 repository, but lack the ability to write directly to it. 
 
-\item External Developers: People who are not actual members of the
+\item External developers: People who are not actual members of the
 development team. These people can only access the public repository
 which is only updated to contain the release versions. 
 \end{itemize}
 
+Students doing independent study or REUs related to Palacios are set
+up as internal developers.  EECS 441 (Resource Virtualization)
+students are generally either set up as internal or external
+developers, depending on their projects.
+
 Because internal and external developers cannot write directly to the
 master repository, they need to first submit their changes to a core
 developer before they can be added to the mainline. We will discuss
@@ -81,7 +99,8 @@ that process in Section~\ref{sec:submission}.
 
 
 \section{Checking out Palacios}
-The central palacios repository is located on {\em
+
+The central Palacios repository is located on {\em
 newskysaw.cs.northwestern.edu} in {\em /home/palacios/palacios}. All
 internal developers have read access to the directory. Each developer
 must create their own local version of the repository, this is done
@@ -91,11 +110,42 @@ with {\em git clone}.
 git clone /home/palacios/palacios
 \end{verbatim}
 
+On the machine {\em newbehemoth.cs.northwestern.edu} you will want to
+use the following command instead. The {\em newskysaw} home
+directories are NFS-mounted on /home-remote.
+
+
+\begin{verbatim}
+git clone /home-remote/palacios/palacios
+\end{verbatim}
+
+
+On any other machine, you can clone the repository via ssh, provided
+you have a newskysaw account:
+ 
+\begin{verbatim}
+git clone ssh://you@newskysaw.cs.northwestern.edu//home/palacios/palacios
+\end{verbatim}
+
+External developers can clone the public repository via the web.  The
+public repository tracks the release and main development branch
+(e.g., devel) of the internal repository with a 30 minute delay.  The
+access information is available on the web site (http://v3vee.org).
+The web site also includes a publicly accessible gitweb interface to
+allow browsing of the public repository.
+
 This creates a local copy of the repository at {\em ./palacios/}.
 
+Note that both {\em newskysaw} and {\em newbehemoth} have all the
+tools installed that are needed to build and test Palacios and Kitten.
+If you develop on another machine, you will need to set those tools up
+for yourself.  This isn't hard and the tools are all free.  See the
+technical report for what tools you will need.
 
-All development work is done in the {\em devel} branch of the
-repository. The developer can access this branch via:
+When you first clone the repository, you will get the {\em master}
+branch, which is used to generate releases.   All development work is
+done in the {\em devel} branch of the repository. The developer can
+access this branch via:
 
 \begin{verbatim}
 git checkout --track -b devel origin/devel
@@ -108,7 +158,7 @@ or
 \end{verbatim}
 
 {\em Important:}
-Note that palacios is very actively developed so the contents of the
+Note that Palacios is very actively developed so the contents of the
 {\em devel} branch are frequently changing. In order to keep up to
 date with the latest version, it is necessary to periodically pull the
 latest changes from the master repository by running \verb.git pull..
@@ -118,106 +168,218 @@ latest changes from the master repository by running \verb.git pull..
 \section{Checking out Kitten}
 
 Kitten is available from Sandia National Labs, and is the main host OS
-we are targetting with Palacios. Loosely speaking core Palacios
+we are targeting with Palacios. Loosely speaking, core Palacios
 developers are internal Kitten developers, and internal Palacios
-developers are external Kitten developers. Because we have limited
-access to the Kitten repository, we are maintaining a local mirror
-copy in {\em /home/palacios/kitten}. 
+developers are external Kitten developers. The public repository for
+Kitten is at {\em http://code.google.com/p/kitten}.  To simplify things,
+we are maintaining a local mirror copy in {\em /home/palacios/kitten}
+that tracks the public repository.
 
 Kitten uses Mercurial for their source management, so you will have to
 make sure the local mercurial version is configured correctly.
-Specifically you should add the following python path to your shell environment.
+Specifically you should add the following Python path to your shell environment.
 
 \begin{verbatim}
 export PYTHONPATH=/usr/local/lib64/python2.4/site-packages/
 \end{verbatim}
 
-You can then clone Kitten from the local mirror:
+You can then clone Kitten from the local mirror.   On {\em newskysaw},
+run: 
 \begin{verbatim}
 hg clone /home/palacios/kitten
 \end{verbatim}
+On {\em newbehemoth}, run
+\begin{verbatim}
+hg clone /home-remote/palacios/kitten
+\end{verbatim}
+On other machines, run
+\begin{verbatim}
+hg clone ssh://you@newskysaw.cs.northwestern.edu//home/palacios/kitten
+\end{verbatim}
+
 
 Both the Kitten and Palacios clone commands should be run from the
-same direcotyr. This means that both repositories should be located at
+same directory. This means that both repositories should be located at
 the same directory level. The Kitten build process depends on this.
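+
+As a sketch of the resulting layout on {\em newskysaw} (the directory
+name {\em work} is only a placeholder), both clones can be made from
+one parent directory:
+\begin{verbatim}
+mkdir work
+cd work
+git clone /home/palacios/palacios
+hg clone /home/palacios/kitten
+ls
+\end{verbatim}
+Afterwards, {\em palacios} and {\em kitten} should appear side by
+side in the listing.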
 
-{\em Important:} Like Palacios, Kitten is very actively developed so
-source tree is frequently changing. In order to keep up to date with
-the latest version, it is necessary to periodically pull the latest
-changes from the mirror repository by running \verb.hg pull. followed
-by \verb.hg update..
+{\em Important:} Like Palacios, Kitten is under active development,
+and its source tree is frequently changing. In order to keep up to
+date with the latest version, it is necessary to periodically pull the
+latest changes from the mirror repository by running \verb.hg pull.
+followed by \verb.hg update..
 
 \section{Compiling Palacios}
-Palacios is capable of targeting 32 and 64 bit operating systems, and
-includes a build process that supports both these
-architectures. Furthermore, Palacios has multiple build locations,
-with multiple makefiles: a top level build directory and a Palacios
-specific build directory. The Palacios build process first generates a
-static library that includes the Palacios VMM. This static library is
-then linked into a host operating system. Palacios internally supports
-GeekOS and can generate a complete OS image via a unified build
-process. To combine Palacios with Kitten, it is necessary to first
-compile Palacios and then to compile Kitten externally link it with
-Palacios. The output of the compilation process is a bit more complex
-and generates multiple binaries, and the specifics can be found in the
-Makefiles.
-
-The top level build directory provides a number of high level make
-targets, and is located in {\em palacios/build/}. It supports building
-32 and 64 bit versions of the Palacios library independently as well
-as building an integrated version of GeekOS.   The basic targets are:
+
+The Palacios build process has been changed recently from a homegrown
+environment to the widely used KBuild environment.  KBuild is also
+used for building Kitten.  Because KBuild is the build environment
+used for Linux, much of what you learn about configuring and building
+Linux kernels is readily applicable to Palacios and Kitten. 
+
+The output of the Palacios build process is a static library that
+includes the Palacios VMM and relevant guest support code blocks. This
+static library is then linked into a host operating system. Palacios
+internally supports GeekOS and can generate a complete OS image via a
+unified build process.  By complete OS image, we mean an ISO image
+containing GeekOS, Palacios, and a guest image (another ISO image) for
+testing. 
+
+These days, however, Palacios is typically embedded into Kitten, not
+GeekOS.  To combine Palacios with Kitten, it is necessary to first
+configure and build Palacios, then to configure and build Kitten,
+linking in Palacios.   Kitten can also be configured to link in a
+guest image for testing.  
+
+\subsection{Configuration}
+
+To configure Palacios, enter the top level Palacios directory and
+execute:
+\begin{verbatim}
+make clean
+make xconfig
+\end{verbatim}
+At this point, you will be presented with a KBuild configuration
+screen, similar to what you would see in configuring a Linux kernel.
+Palacios has far fewer options, however.   If you don't have X or
+don't want the graphical configuration system, you can also use the
+\verb.menuconfig. or \verb.config. targets.  The available options
+change over time, so we do not cover all of them here, but here are a
+few that are usually important, with their recommended values noted:
+\begin{itemize}
+\item Target Configuration:   
 \begin{itemize}
-\item \verb.make palacios-full32. -- Generates a 32 bit version of the Palacios static library 
-\item \verb.make palacios-full64. -- Generates a 64 bit version of the
-Palacios static library
-\item \verb.make geekos. -- Compiles the GeekOS kernel, and link it with the
-Palacios static library 
-\item \verb.make geekos-iso. -- Generate an ISO boot disk image from the
-GeekOS kernel that has been compiled
-\end{itemize}
 
-The second build directory is located at {\em palacios/palacios/build}
-and handles only the Palacios compilation process. It supports a
-differnt set of targets and arguments:
+\item Red Storm (Cray XT3/XT4) --- turn on to target Cray XT4
+supercomputers.  (off)
+\item AMD SVM Support --- targets AMD processors with the SVM hardware
+virtualization features (on)
+\item Intel VMX support --- targets Intel processors with the VMX
+hardware virtualization features (on)
+\item Compile for a multi-threaded OS (on)
+\item Enable VMM telemetry support --- this is lightweight logging and
+data collection (on)  
+\item Enable VMM instrumentation --- this is heavyweight logging and
+data collection (off)
+\item Enable passthrough video --- this lets a guest write directly to
+the video card (on)
+\item Enable experimental options --- this makes it possible to select
+features that are under current development (on).  You probably want
+to leave VNET turned off.  VNET is an experimental VMM-embedded
+overlay network under development by Lei Xia and Yuan Tang.
+\item Enable built-in versions of stdlib functions --- this adds
+needed stdlib functions that the host OS may not supply.  For use with
+Kitten, turn this on and enable strcasecmp() and atoi().
+\item Enable built-in versions of stdio functions (off)
+\end{itemize}
+\item Symbiotic Functions (these are experimental options for Jack
+Lange's thesis). 
 \begin{itemize}
-\item \verb.make ARCH=32. -- iteratively compiles a 32 bit version of Palacios
-\item \verb.make ARCH=64. -- iteratively compiles a 64 bit version of
-Palacios
-\item \verb.make ARCH=32 world. -- fully recompiles a 32 bit version of
-Palacios
-\item \verb.make ARCH=64 world. -- fully recompiles a 64 bit version of
-Palacios
+\item Enable Symbiotic Functionality --- This adds symbiotic features
+to Palacios, specifically support for discovery and configuration by
+symbiotic guests, the SymSpy passive information interface for
+asynchronous symbiotic guest $\leftrightarrow$ symbiotic VMM
+information flow, and the SymCall functional interface for synchronous
+symbiotic VMM $\rightarrow$ symbiotic guest upcalls.  (on)
+\item Symbiotic Swap --- Enables the SwapBypass symbiotic service for
+symbiotic Linux guests.  (off)
 \end{itemize}
-
-Both build levels support compilation directives that control the
-debugging messages that are generated by Palacios. These are specified
-by appending a \verb.DEBUG_<COMPONENT>=1. to the end of the
-\verb.make. command. The components that are currently supported are:
+\item Debug Configuration
+\begin{itemize}
+\item Compile with Debug info --- adds debug symbols (-g)  (off)
+\item Enable Debugging --- makes it possible to show PrintDebug output
+(on).   You can selectively turn on debugging output for each major
+VMM component, including shadow paging, nested paging, control
+registers, interrupts, I/O, instruction emulation and XED, halt, and
+the device manager.   Note that the more debugging output you turn on,
+the slower the VMM will go since it will have to wait for the prints
+to finish. 
+\end{itemize}
+\item BIOS Selection --- Lets you select which code blobs will be
+used for bootstrapping the guest.   There are currently three:  a
+BIOS, a Video BIOS, and the VMXAssist V8086 service (the latter is used only on
+Intel VMX).   Generally, you should not need to change these.
+\item Virtual Devices --- virtual devices can be instantiated and
+added to a guest.   The following is a list of the currently
+implemented virtual devices.
+\begin{itemize}
+\item BOCHS Debug Console Device --- used for debugging output from
+the guest BIOS (on)
+\item OS Debug Console Device --- used for debugging output from
+the guest kernel (on)
+\item 8259A PIC - legacy Programmable Interrupt Controller chip --- used
+for bootstrap of most guests (on)
+\item APIC - in-processor Advanced Programmable Interrupt Controller --- used
+for interrupt delivery on almost all guests (on)
+\item IOAPIC - Off-chip APIC  --- used for interrupt delivery for almost
+all guests (on)
+\item i440fx Northbridge --- emulation of a typical PC North Bridge
+chip, used on almost all guests (on)
+\item PIIX3 Southbridge --- emulation of a typical PC South Bridge
+chip, used on almost all guests (on)
+\item PCI --- emulation of a PCI bus - needed for attaching most
+devices (on)   
+\begin{itemize}
+\item Passthrough PCI --- allows us to make a hardware PCI device visible and
+directly accessible by the guest (on)
+\end{itemize}
+\item Generic --- this is a run-time configurable device that can intercept I/O port read/writes and memory region reads/writes.   Intercepted reads and writes can either be ignored or forwarded to actual hardware, and the data flow can optionally be printed.   This is a useful tool with at least three purposes.  First, it makes it possible to ``stub out'' hardware that isn't currently implemented and for which we don't want to allow passthrough access. Second, it makes it possible to provide low-level passthrough access to physical hardware.   Third, it can be used to spy on guest/device interactions, which is very helpful when trying to understand the interface of a device.
+\item NVRAM - motherboard configuration memory --- needed by BIOS bootstrap (on)
+\item Keyboard - Generic PS/2 keyboard, including mostly broken mouse
+implementation (on)
+\item IDE --- Support for virtual IDE controllers that support disks
+and CD ROMs (on)
+
+\item NE2K - NE2000 and RTL8139 network devices (off)
+\item CGA - CGA video card (partial implementation) (off)
 \begin{itemize}
-\item \verb.DEBUG_ALL=1. -- enables debugging for all the VMM components
-({\em Warning:} this generates a {\em lot} of debug information.
-\item \verb.DEBUG_SHADOW_PAGING=1.
-\item \verb.DEBUG_CTRL_REGS=1.
-\item \verb.DEBUG_INTERRUPTS=1.
-\item \verb.DEBUG_IO=1.
-\item \verb.DEBUG_KEYBOARD=1.
-\item \verb.DEBUG_PIC=1.
-\item \verb.DEBUG_PIT=1.
-\item \verb.DEBUG_NVRAM=1.
-\item \verb.DEBUG_GENERIC=1.
-\item \verb.DEBUG_EMULATOR=1.
-\item \verb.DEBUG_RAMDISK=1.
-\item \verb.DEBUG_XED=1.
-\item \verb.DEBUG_HALT=1.
-\item \verb.DEBUG_DEV_MGR=1.
-\item \verb.DEBUG_APIC=1.
+\item Telnet Virtual Console (off)   When CGA and Telnet Console are
+on, it is possible to telnet to the console of the guest.   Eventually
+the rest of this will provide a simple bitmapped video console for VNC
+access.
 \end{itemize}
+\item RAMDISK storage backend --- used to create RAM disk
+implementations of block devices (on)
+\item NETDISK storage backend --- used to create network-attached disk
+implementations of block devices, e.g., network block devices (on)
+\item TMPDISK storage backend --- used to create temporary storage
+implementations of block devices (on)
+\item Linux Virtio Balloon Device --- used for memory ballooning by
+Linux virtio-compatible guests (on)
+\item Linux Virtio Block Device --- used for fast block device support
+by Linux virtio-compatible guests (on)
+\item Linux Virtio Network Device --- used for fast network device support
+by Linux virtio-compatible guests (on) 
+\item Symbiotic Swap Disk (multiple versions) --- used for the
+SwapBypass service (off)
+\item Disk Performance Model --- used for the
+SwapBypass service (off)
+\end{itemize}
+\end{itemize}
+
+\subsection{Compilation}
 
+After configuring Palacios---remember to save your changes---you can
+compile it by executing
+\begin{verbatim}
+make 
+\end{verbatim}
+This will produce the file {\em libv3vee.a} in the current directory.
+This static library contains the Palacios VMM and is ready for
+embedding into an OS, such as Kitten.  The library provides the
+ability to instantiate and run virtual machines.  By default, on a 64
+bit machine, the library is compiled for 64 bit machines (x86\_64),
+while on a 32 bit machine, it is compiled for 32 bit machines.  You
+can override this using the ARCH=i386 or ARCH=x86\_64 arguments to the
+make, provided you have the relevant tools available.  The 64 bit
+version is what you need for use with Kitten.  A 64 bit Palacios can
+run both 64 and 32 bit guests.  Both {\em newskysaw} and {\em
+newbehemoth} are 64 bit machines. 
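+
+For example, assuming the needed toolchain is installed, an explicit
+64 bit build from the top level Palacios directory might look like
+this:
+\begin{verbatim}
+make ARCH=x86_64
+ls -l libv3vee.a
+\end{verbatim}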
 
 
 \section{Compiling Kitten}
-Kitten requires a 64 bit version of Palacios, so make sure that
-Palacios has been correctly compiled before compiling Kitten.
+Kitten requires a 64-bit version of Palacios, so make sure that
+Palacios has been correctly compiled before compiling Kitten.  The
+current default for Palacios is 64 bit. 
 
 \subsection{Configuration}
 Kitten borrows a lot of concepts from Linux, including the Linux build
@@ -230,52 +392,189 @@ accessed via any of these make targets.
 \item \verb.make menuconfig.
 \end{itemize}
 
-There are some specific configuration options that should be disabled
-to work with Palacios. Because Palacios is configured by default to
-provide a guest with direct access to the VGA console, the {\em VGA
-console} device driver should be disbabled in the Kitten
-configuration. Similarly the {\em VM console} driver should be
-disabled as well.
+Of course, there is a range of configuration options.  In the
+following, we note only the most important:
+\begin{itemize}
+\item Target Configuration
+\begin{itemize}
+\item System Architecture --- you probably want to set this to
+PC-Compatible, unless you are working on Red Storm.
+\item Processor Family --- you want to set this to either
+AMD-Opteron/Athlon64 or Intel-64/Core2, depending on whether you have
+a 64 bit AMD or 64 bit Intel processor.
+\end{itemize}
+\item Virtualization
+\begin{itemize}
+\item Include Palacios VMM --- this will link against the Palacios
+library (on)
+\item Path to pre-built Palacios tree --- directory where libv3vee.a
+can be found.
+\item Path to guest image --- location of the test guest OS
+image that will be embedded.  We will say more about this later.
+Essentially, however, a guest image consists of a blob that begins
+with an XML description of the desired guest environment and the
+contents of the remainder of the blob.   The remainder of the blob
+usually contains disk or cd images.
+\end{itemize}
+\item Networking 
+\begin{itemize}
+\item Enable LWIP TCP/IP stack.  This activates a simple TCP/IP stack
+that things like NETDISK can use. (on)
+\end{itemize}
+\item Device Drivers
+\begin{itemize}
+\item VGA Console --- driver for basic video.  If you turn on
+passthrough video in Palacios, you should turn this off.
+\item Serial Console --- driver for serial port console.  (on)
+\item VM Console --- driver for Kitten console on top of Palacios.  If
+Kitten is run {\em as a guest}, and it has VM Console on, then it can output
+cleanly via the Palacios OS Console device (off).
+\item NE2K Device Driver --- driver for NE2K and RTL8139 network cards
+(on)
+\item VM Network Driver --- driver for Kitten network output using
+Palacios.  If Kitten is run {\em as a guest}, and it has VM Network
+Driver, then it can send and receive packets using the Palacios Linux
+virtio network device.  (off)
+\end{itemize}
+\item ISOIMAGE configuration:  Kitten kernel arguments.   Note that
+this is NOT for the guest image, but rather for the Kitten image.  You
+can leave this alone.  For Palacios operation, it's important that the
+option \verb.console=serial. appears.  If the NE2K/RTL8139 driver
+should be used, \verb.net=rtl8139. should appear. 
+\item Kernel Hacking
+\begin{itemize}
+\item Kernel Debugging --- here you can turn on various Kitten
+Linux-like debugging features.   Only a few are noted below:
+\begin{itemize}
+\item Compile the kernel with debug info --- if this is on, you will
+have debugging information compiled in (-g)
+\item KGDB --- if you have this enabled, you will be able to attach to
+the running kernel from the GDB debugger.  This means you can also
+attach to Palacios, which is embedded.   If you want to debug Palacios
+using KGDB, be sure to turn on debugging in Palacios as well.
+\end{itemize}
+\end{itemize}
+\item Include Linux compatibility layer --- if this is on, you can 
+selectively add Linux system calls and other functionality to Kitten.
+Kitten is able to run Linux ELF executables as user processes with
+this layer.   
+\end{itemize}
 
-Furthermore, because the VGA console is not being used the {\em Kernel
-Command Line Arguments} must be modified to remove the {\em VGA}
-device from the console list.
+The guest OS that is to be booted as a VM is included as a blob
+pointed to by ``Path to guest image''.   The blob starts with an XML
+description of the guest, followed by other chunks of data used, for
+example, as the content of virtual hard drives or CD ROMs.  Please see
+Section~\ref{sec:guestconfig} for basic information on how to use the
+guest builder to assemble a guest OS blob. 
 
-The guest OS that is booted as a VM is included as an ISO image in raw
-binary format inside Kitten's {\em init\_task}. To change the guest
-ISO, you must change the makefile for the init\_task. This is located
-in {\em user/hello\_world/Makefile} and the syntax is well commented.
-On {\em newskysaw} a collection of guest ISO images are located in
-{\em /opt/vmm-tools/isos/}. 
+By default, the init task that is executed after Kitten boots (located
+in user/hello\_world) does a number of Kitten tests.  One of these is
+a test of the VMM API, which is implemented using Palacios.  When this
+test is done, a VM is created, configured according to the XML, and
+the guest OS blob is launched in it.
 
 
 \subsection{Compilation}
-After Kitten has been configured the compilation can be done. The
-general process is to compile a reference build of Kitten, followed by
-compiling Palacios support as a kernel module, and then doing a new
-full recompilation of Kitten.
 
-The specific compilation steps are run from the top level Kitten directory:
+After Kitten has been configured it can be compiled.  This is done
+simply by executing
 \begin{verbatim}
-make
-cd palacios
-make -C .. M=`pwd`
-cp built-in.o ../modules/palacios-mod.o
-cd ..
-make
 make isoimage
 \end{verbatim}
+This command will compile Kitten (with Palacios embedded in it) and
+the init task (which will contain the guest OS blob), and then
+assemble an ISO image file which can be used to boot a machine.  The ISO
+image is located at {\em ./arch/x86\_64/boot/image.iso}.  
+
+This image file can be used for booting a QEMU emulation environment,
+for booting a remote machine using PXE, or can be burned to CD/DVD for
+booting a machine physically. 
+
+
+\section{Basic Guest Configuration}
+\label{sec:guestconfig}
+
+To configure a guest, you write an XML configuration file, which
+contains references to other files that contain data needed to
+instantiate stateful devices such as virtual hard drives and CD ROMs.
+You supply this information to a guest builder utility that assembles
+a guest image suitable for reference in the Kitten configuration, as
+described above.  
+
+The guest builder utility is located in {\em
+palacios/utils/guest\_creator}.  You will need to run \verb.make. in that
+directory to compile it, resulting in the executable named {\em
+build\_vm}.  Also located in that directory is an example configuration
+file, named {\em default.xml}.   We typically use this file as a
+template.  It is carefully commented.  In summary, a configuration
+consists of
+\begin{itemize}
+\item Physical memory size of the guest
+\item Basic VMM settings, such as what form of virtual paging is to be
+used, the scheduler rate, whether services like telemetry are on, etc.
+\item A memory map that maps regions of the host physical address
+space to the guest physical address space.  This can, for example,
+make a framebuffer visible in the guest.
+\item A list of the files that will be used in assembling the image.
+For example, the contents of a boot CD.
+\item A list of the devices that the guest will have, including
+configuration data for each device.
+\end{itemize}
+There are a few subtleties involved with devices.  One is that some
+devices form attachment points for other devices.  The PCI device is
+an example of this.  Another is that each device needs to define how
+it is attached (e.g., direct (implicit), via a PCI bus, etc.).
+Finally, there may be multiple instances of devices.   For example, a
+PCI passthrough device is instantiated for every underlying PCI device
+we want to make visible in the guest. 
+
+The XML configuration format is carefully designed to be extensible.
+For example, new devices could use additional or new configuration
+options.  The configuration parser in Palacios essentially ignores XML
+blocks it doesn't understand. 
+
 
-This generates an ISO boot image containing Kitten, Palacios, and the
-guest that will be run as a VM. The ISO image is located at {\em
-./arch/x86\_64/boot/image.iso}.
+To build a guest, one runs
+\begin{verbatim}
+palacios/utils/guest_creator/build_vm myconfig.xml -o myimage.dat
+\end{verbatim}
+Here, {\em myimage.dat} is the guest image that can be given to
+Kitten. 
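+
+Putting these steps together, a typical session might look like the
+following sketch.  The names {\em myconfig.xml} and {\em myimage.dat}
+are placeholders, and the editing step is up to you:
+\begin{verbatim}
+cd palacios/utils/guest_creator
+make
+cp default.xml myconfig.xml
+# edit myconfig.xml as needed
+./build_vm myconfig.xml -o myimage.dat
+\end{verbatim}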
+
+A common kind of guest used for testing is one that boots some form of
+bootable Linux distribution, or another live OS distribution.  These
+distributions are CD ROM images (ISOs).  A range of them are available
+on {\em newskysaw} under {\em /opt/vmm-tools/isos}.  We often use
+Puppy Linux ({\em puppy.iso}) or Finnix ({\em finnix.iso}), for
+example, but ISOs are also available for Windows of different flavors,
+DOS, GeekOS, and others.  If you just want to use some guest ISO image
+like this, you can generally just copy the default XML file, and
+modify the
+\verb.filename=. attribute here:
+\begin{verbatim}
+ <files>
+    <!-- The file 'id' is used as a reference for 
+         other configuration components -->
+    <file id="boot-cd" filename="/home/jarusl/image.iso" />
+    <!--<file id="harddisk" filename="firefox.img" />-->
+ </files>
+\end{verbatim}
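+
+For instance, assuming you are building on {\em newskysaw} and want to
+boot Puppy Linux, the modified entry might read as follows (the exact
+file name under {\em /opt/vmm-tools/isos} should be checked first):
+\begin{verbatim}
+    <file id="boot-cd" filename="/opt/vmm-tools/isos/puppy.iso" />
+\end{verbatim}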
 
+For careful, repeatable experimentation, it is often convenient to
+build your own simplified Linux guest image.  It will boot {\em much}
+faster than a full blown distribution and you can readily set up an
+environment in which you can exert very tight control, being able to
+modify the Linux kernel, the included files (e.g., benchmarks), and
+other components very rapidly.  To learn more about how to do this,
+please consult the separate document named {\em Building a bootable
+guest image for Palacios and Kitten}.
 
 \section{Running Palacios/Kitten}
-Kitten and Palacios are capable of running under Qemu, which makes
-debugging much simpler.
+Kitten and Palacios are capable of running under QEMU, which makes
+debugging much simpler.  QEMU is a user-level Linux or Windows program
+that emulates a PC machine. 
 
-The basic form of the command to start the Qemu emulator is:
+The basic form of the command to start the QEMU emulator is:
 \begin{verbatim}
 /usr/local/qemu/bin/qemu-system-x86_64 -smp 1 -m 1024 \
         -serial file:./serial.out \
@@ -283,35 +582,48 @@ The basic form of the command to start the Qemu emulator is:
         < /dev/null
 \end{verbatim}
 
-The command starts up a single processor emulated machine, with 1gig
-of RAM and a cdrom drive loaded with the Kitten ISO image. Furthermore
-all output to the serial port is written directly to a file called
-{\em serial.out}. This command can be copied into a shell script for easy access.
+The command starts up a single processor emulated machine, with 1GB of
+RAM and a CD-ROM drive loaded with the Kitten ISO image.  All output
+to the serial port is written directly to a file called {\em
+  serial.out}. This command can be copied into a shell script for easy
+access.
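+
+Because the serial output goes to a file, one simple way to watch it
+while the emulated machine runs is to follow that file from a second
+terminal, in the directory from which QEMU was started:
+\begin{verbatim}
+tail -f ./serial.out
+\end{verbatim}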
+
+We can also run Palacios/Kitten on physical hardware.  The slow way is
+to burn the Kitten ISO image onto a CD ROM and then boot the test
+machine with it.  The much faster way is to set the test machine up to
+use the PXE network boot system (most modern BIOSes support this), and
+boot your Kitten image over the network.  The debugging output will
+then appear on the actual serial port of the physical machine.  For
+the Northwestern environment, please talk to Jack Lange or Peter Dinda
+if you need to be able to do this.  Northwestern has a range of AMD
+and Intel boxes for testing, as do UNM and Sandia.    A different form
+of network boot is used for Red Storm. 
+
 
 \section{Development Guidelines}
 
 There are standard requirements we have for code entering the mainline. 
 
-First and foremost, Palacios is designed to be OS indenpendent and
-support 32 and 64 bit architectures. This means that developers should
+First and foremost, Palacios is designed to be OS independent and
+support 32-bit and 64-bit architectures. This means that developers should
 not include any external OS specific dependencies in any Palacios
-component. Also all changes need to be tested on both 32 and 64 bit
-architectures to make sure that they compile as well as run corrrectly.
+component. Also all changes need to be tested on both 32-bit and 64-bit
+architectures to make sure that they compile as well as run correctly.
 
 \paragraph*{Coding Style}
 
-"The use of equal negative space, as a balance to positive space, in a
+``The use of equal negative space, as a balance to positive space, in a
 composition is considered by many as good design. This basic and often
-overlooked principle of design gives the eye a "place to rest,"
-increasing the appeal of a composition through subtle means."
+overlooked principle of design gives the eye a `place to rest,'
+increasing the appeal of a composition through subtle means.''
 \newline\newline
-Translation: Use the spacebar, newlines, and parentheses. 
+Translation: Use the space bar, newlines, and parentheses. 
 
 Curly-brackets are not optional, even for single line conditionals. 
 
 Tabs should be 4 characters in width.
 
-{\em Special:} If you are using XEmacs add the following to your \verb.init\.el. file:
+{\em Special:} If you are using XEmacs add the following to your \verb!init.el! file:
 \begin{verbatim}
 (setq c-basic-offset 4)
 (c-set-offset 'case-label 4)
@@ -324,7 +636,6 @@ if(a&&b==5||c!=0) return;
 
 
 {\em Good}
-
 \begin{verbatim}
 if (((a) && (b == 5)) || 
     (c != 0)) {
@@ -335,7 +646,7 @@ if (((a) && (b == 5)) ||
 
 
 \paragraph*{Fail Stop}
-Because booting a basic linux kernel results in over 1 million VM exits
+Because booting a basic Linux kernel results in over 1 million VM exits,
 catching silent errors is next to impossible. For this reason
 ANY time your code has an error it should return -1, and expect the
 execution to halt. 
@@ -363,18 +674,17 @@ are defined should be declared as static and not included in the
 header file. You should make an effort to use static functions
 whenever possible. 
 
-\item \verb.v3_. prefix
-\newline
-Major interface functions should be named with the prefix \verb.v3_. This
-xallows easy understanding of how to interact with the subsystems. And
-in the case that they need to be externally visible to the host os,
-make them unlikely to collide with other functions. 
+\item \verb.v3_. prefix \newline Major interface functions should be
+  named with the prefix \verb.v3_. This allows easy understanding of
+  how to interact with the subsystems.  In the case that they need to
+  be externally visible to the host OS, make them unlikely to collide
+  with other functions.
 \end{enumerate}
 
 \paragraph*{Debugging Output}
-Debugging output is sent through the host os via functions in the
+Debugging output is sent through the host OS via functions in the
 \verb.os_hooks. structure. These functions have various wrappers of the form
-\verb.Print*., with printf style semantics. 
+\verb.Print*., with \texttt{printf}-style semantics. 
 
 Two functions of note are \verb.PrintDebug. and \verb.PrintError..
 
@@ -414,7 +724,7 @@ other is not a viable option either. Furthermore, once a patch is
 applied to the mainline, it will generate a conflicting commit that
 will become present when the developer next pulls from the central
 repository. This can result in both repositories getting out of
-sync. It is possible to deal with this by manually rebasing the local
+sync. It is possible to deal with this by manually re-basing the local
 repository, but it is difficult and error-prone. 
 
 This approach also does not map well when patches are being revised. A
@@ -427,7 +737,7 @@ mainline.
 For this reason most internal developers should seriously consider
 {\em stacked git}. Stacked git is designed to make patch development
 easier and less of a headache. The basic mode of operation is for a
-developer to intialize a patch for a new feature and then continuously
+developer to initialize a patch for a new feature and then continuously
 apply changes to the patch. Stacked Git allows a developer to layer a
 series of patches on top of a local git repository, without causing
 the repository to unsync due to local commits. Basically, the
@@ -445,8 +755,8 @@ newskysaw} in \verb./opt/vmm-tools/bin/.
 
 Brief command overview:
 \begin{itemize}
-\item \verb.stg init. -- Initialize stacked git in a given branch
-\item \verb.stg new. -- create a new patch set, an editor will open
+\item \verb.stg init. -- initialize stacked git in a given branch
+\item \verb.stg new. -- create a new patch set; an editor will open
 asking for a commit message that will be used when the patch is
 ultimately committed.
 \item \verb.stg pop. -- pops a patch off of the source tree.
@@ -455,14 +765,14 @@ ultimately committed.
 that can then be emailed.
 \item \verb.stg refresh. -- commits local changes to the patch set at
 the top of the applied stack.
-\item \verb.stg fold. -- Apply a patch file to the current
+\item \verb.stg fold. -- apply a patch file to the current
 patch. (This is how you can manage revisions that are made by other developers).
 \end{itemize}
 
-You should definately look at the online documentation to better
+You should definitely look at the online documentation to better
 understand how stacked git works. It is not required of course, but if
-you want your changes to be applied its up to you to generate a patch
-that is acceptable to a core developer. Ultimately using Stacked git
+you want your changes to be applied it's up to you to generate a patch
+that is acceptable to a core developer. Ultimately, using Stacked git
 should be easier than going it alone.
 
 
@@ -527,64 +837,188 @@ hg qpush -a
 
 \section{Networking}
 
-\section{Configuring the development host's Qemu network}
-Set up Tap interfaces:
-
-/root/util/tap\_create tapX
-
-Bridging tapX with eth1 will only work (work = send packet and also
-make packet visible on localhost) if the IP address is set correctly
-(correctly = match network it is connected to e.g., network of eth1)
-so bring up the network inside of the VM / QEMU as 10-net, and it
-should route through the eth1 rule and be visible both on the host and
-in the physical network
-
+Both the Kitten and GeekOS substrates on which Palacios can run
+currently include drivers for two simple network cards, the NE2000,
+and the RTL8139.  The Kitten substrate is acquiring an ever increasing
+set of drivers for specialized network systems.   A lightweight
+networking stack is included so that TCP/IP networking is possible
+from within the host OS kernel and in Palacios.  
+
+When debugging Palacios on QEMU, it is very convenient to add an
+RTL8139 card to your QEMU configuration, and then drive it from within
+Palacios.  QEMU can be configured to provide local connectivity to the
+QEMU emulated machine, including bridging the emulated machine with a
+physical network.  Local connectivity can be done with redirection, or
+with a TAP interface.  For global connectivity, a TAP interface must
+be used; it is bridged to a physical interface.
+
+\section{Configuring the development host's QEMU network}
+
+To get local connectivity with redirection, no networking changes on
+the host are needed.  However, people usually want to use TAP-based
+networking, which does require changes.  For one thing, TAP interfaces
+can be inspected with tools like wireshark, which makes for much
+easier debugging of network code.
+
+In order to get QEMU networking to function, it is necessary to create
+TAP interfaces, and, optionally, to bridge them to real networks.  A
+development machine typically will have several TAP interfaces, and
+more can be created.  Generally, each developer should have a TAP
+interface of his or her own.  In the following, we will use our
+development machine, newskysaw, as an example.
+
+To set up a TAP interface on newskysaw, the following command is used:
+\begin{verbatim}
+/root/util/tap_create tapX
+\end{verbatim}
 
-\subsection{Configuring Kitten}
+When QEMU runs with a TAP interface, it will use /etc/qemu-ifup to
+bring up the interface.  On newskysaw, /etc/qemu-ifup looks like this:
 
-To enable networking in Qemu, networking needs to be enabled in the configuration.
+\begin{verbatim}
+#!/bin/bash
+echo "Executing /etc/qemu-ifup - no external bridging"
+echo "Bringing up $1 for bridged mode..."
+NET=`echo $1 | cut -dp -f2` 
+sudo /sbin/ifconfig $1 172.2${NET}.0.1 up
+sleep 2
+\end{verbatim}
 
-Make sure turn on the network device driver, networking, and input
-kernel command 'console=serial net=rtl8139'
+The interface tap$N$ is brought up with the IP address 172.2$N$.0.1.
+ifconfig will also create a routing rule that sends 172.2$N$.0.0/16
+traffic to tap$N$.  The upshot is that if the code running in QEMU
+uses an IP address in this network (for example: 172.2$N$.0.2), you
+will be able to talk to it from newskysaw.  For example, from
+newskysaw, if you ping 172.21.0.2, the packet (and ARP) will go out via
+tap1.  The source address will appear to be 172.21.0.1.  The QEMU
+machine will see these packets on its interface, and the software
+controlling its interface can respond to 172.21.0.1.  
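+
+As a quick check of the setup above, the following sketch inspects the
+interface and its route from newskysaw.  It assumes that tap1 has
+already been brought up by /etc/qemu-ifup and that the code running in
+QEMU has been given the address 172.21.0.2; both are assumptions made
+for illustration.
+\begin{verbatim}
+/sbin/ifconfig tap1      # tap1 should be up with address 172.21.0.1
+/sbin/route -n           # look for a 172.21.0.0 entry that uses tap1
+ping -c 3 172.21.0.2     # assumed address of the code running in QEMU
+\end{verbatim}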
+
+This form of networking is local to the machine.  You can also bridge
+a TAP interface with a physical interface.  The result of this is that
+a packet sent on it will be sent on the physical interface.  To do
+this requires more effort (and is not set up by default on newskysaw).
+As an example, consider that on newskysaw, the physical interface eth1
+is connected to a private network switch to which the lab test
+computers (v-test-amd, v-test-amd2, etc.) are connected.  To bridge,
+for example, tap10, to this interface, you would do the following
+(with root's help):
+\begin{enumerate}
+\item You need to bring up eth1 (ifconfig eth1 up {\em address}
+netmask {\em mask}).  It is important that the address and mask you
+choose are appropriate for the network eth1 is connected to.
+\item You would bring up tap10 without an address:  /sbin/ifconfig
+tap10 up
+\item You would bridge tap10 and eth1:  /usr/sbin/brctl addif br0
+tap10; /usr/sbin/brctl addif br0 eth1.  This assumes that br0 was
+previously created. 
+\end{enumerate}
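+
+Put together, the steps above might look like the following sketch,
+run as root.  ADDRESS and MASK are placeholders for values that suit
+the network eth1 is connected to.  The \verb.brctl addbr. line and the
+final two lines are additions for illustration and are not part of the
+newskysaw setup: the first creates br0 if it does not already exist,
+and the last two bring the bridge up and list its member interfaces.
+\begin{verbatim}
+/usr/sbin/brctl addbr br0                    # skip if br0 already exists
+/sbin/ifconfig eth1 ADDRESS netmask MASK up  # step 1
+/sbin/ifconfig tap10 up                      # step 2: no address on tap10
+/usr/sbin/brctl addif br0 tap10              # step 3: add both interfaces
+/usr/sbin/brctl addif br0 eth1
+/sbin/ifconfig br0 up
+/usr/sbin/brctl show br0                     # tap10 and eth1 should be listed
+\end{verbatim}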
 
-How to set ip address in kitten:
+Bridging tap$N$ with eth1 will only work (where ``work'' means sending
+a packet on the network and making the packet visible on localhost) if
+the IP address in the code running in QEMU is set correctly.  This
+means that it needs to be set to correspond to the network of eth1.  
+For the newskysaw configuration, this is a 10-net address.
 
-Kitten ip address setting is in file drivers/net/ne2k/rtl8139.c, in the code below which is located in function rtl8139\_init.
 
-  struct ip\_addr ipaddr = { htonl(0 | 10 << 24 | 0 << 16 | 2 << 8 | 16 << 0) }; 
-  struct ip\_addr netmask = { htonl(0xffffff00) }; 
-  struct ip\_addr gw = { htonl(0 | 10 << 24 | 0 << 16 | 2 << 8 | 2 << 0) };
+\subsection{Configuring Kitten}
 
-This sets the ip address as 10.0.2.16, netmask 255.255.255.0 and gateway address 10.0.2.2, change it as you need.
+Kitten needs to be explicitly configured to use networking. Currently
+only a subset of the networking configurations are supported. To
+enable an Ethernet network you should enable the following options:
 
+\begin{itemize}
+\item Enable TCP Support
+\item Enable UDP Support
+\item Enable socket API
+\item Enable ARP support
+\end{itemize}
 
+The other options are not supported, and enabling them will probably
+break the kernel compilation.
 
-\subsection{Running with networking}
+To allow Kitten to communicate with the QEMU network card you also
+need to enable the appropriate device driver: \newline
+\verb.NE2K Device Driver (rtl8139).
 
-\paragraph*{Tap Interface}
-In which, the command line: 
+The driver then needs to be listed as a Kernel Command Line argument
+in the {\em ISOIMAGE configuration}. To do this add
+\verb.net=rtl8139. to the end of the argument string.
 
--net tap, ifname=tap2
+Kitten currently does not support the dynamic assignment of IP
+addresses at runtime. Because of this it is necessary to hardcode the
+IP address into the device driver. For the rtl8139 network driver look
+in the file {\em drivers/net/ne2k/rtl8139.c} for the function
+\verb.rtl8139_init..
 
-specifies Qemu to use the host's tap0 as its network interface, then Qemu can access the host's physical network.
+There should be a block of code that looks like the following:
+\begin{verbatim}
+  struct ip_addr ipaddr = { htonl(0 | 10 << 24 | 0 << 16 | 2 << 8 | 16 << 0) }; 
+  struct ip_addr netmask = { htonl(0xffffff00) }; 
+  struct ip_addr gw = { htonl(0 | 10 << 24 | 0 << 16 | 2 << 8 | 2 << 0) };
+\end{verbatim}
 
-\paragraph*{Redirection}
+This sets the IP address as 10.0.2.16, netmask 255.255.255.0 and
+gateway address 10.0.2.2. Change these assignments to match your configuration.
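+
+Before rebuilding, it can help to check what a shift expression of
+this form evaluates to.  As a small sketch, bash arithmetic uses the
+same shift and OR operators as C, so the host-order value for
+10.0.2.16 can be printed on the development machine:
+\begin{verbatim}
+printf '0x%08x\n' $(( 10 << 24 | 0 << 16 | 2 << 8 | 16 << 0 ))
+\end{verbatim}
+This prints 0x0a000210, one byte per octet; \verb.htonl. then converts
+that value to network byte order.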
 
-Also you can use the following command instead to redirect host's 9555 port to Qemu's 80 port.
 
--net user -net nic,model=rtl8139  -redir tcp:9555::80
+\paragraph*{Kitten as the Guest OS}
 
-In this case, you can access Qemu's 80 port in the host like:
+When running Kitten as a VM, the above applies except that you will
+want to enable the {\em VMNET} device driver instead of the {\em rtl8139}.
 
-telnet localhost 9555
 
-Qemu has many options to build up a virtual or real networking. See http://www.h7.dion.ne.jp/~qemu-win/HowToNetwork-en.html for more information.
+\subsection{Running with networking}
 
+\paragraph*{TAP Interface}
+Running with a TAP interface provides either local or global
+connectivity (depending on how the TAP interface is configured and/or
+bridged).  From the perspective of the QEMU command line, both look
+the same, however.  You simply add something like this to the command
+line:
+\begin{verbatim}
+-net tap,ifname=tap2 -net nic,model=rtl8139
+\end{verbatim}
+The first \verb.-net. option indicates that you want to use a TAP
+interface, specifically \verb.tap2..   The second \verb.-net. option
+specifies that this interface will appear to code in the QEMU machine
+to be a network interface card of the specific model RTL8139.  Note
+that this is a model for which we have a driver.  If tap2 were
+bridged, we'd get global connectivity.  If not, we would just get
+local connectivity.  
 
 
+\paragraph*{Redirection}
+It is also possible to achieve limited local connectivity even if you
+have no TAP support on your development machine.  In redirection, QEMU
+essentially acts as a proxy, translating between TCP or other connections and
+low-level packet operations on the network interface in the QEMU
+machine.  For example, the following options will redirect the host's
+9555 port to the QEMU machine's 80 port:
+\begin{verbatim}
+-net user -net nic,model=rtl8139  -redir tcp:9555:10.10.10.33:80
+\end{verbatim}
+The first \verb.-net. option indicates that we are using user-level
+networking (proxying).  The second \verb.-net. option indicates that
+this user-level network will appear in the QEMU machine as an RTL8139
+network card.   The \verb.-redir. option indicates that connections on
+localhost:9555 will be translated into equivalent packet exchanges on
+the RTL8139 card in the QEMU machine.  However, we have to tell QEMU
+which IP address and port to use on the QEMU machine's side.  This is
+what the 10.10.10.33 address and port 80 are.  In the example, if you
+access port 9555 on localhost, say with:
+\begin{verbatim}
+telnet localhost 9555
+\end{verbatim}
+The packets that appear in the QEMU machine will be bound for
+10.10.10.33, port 80.  Within the QEMU machine, your RTL8139 interface
+had better then be up on that address. 
 
+QEMU has many options to build up virtual or real networking. See
+http://www.h7.dion.ne.jp/$\sim$qemu-win/HowToNetwork-en.html for more
+information.
 
 
-For more questions, talk to Jack or Lei.
+For more questions, talk to Jack, Lei, or Peter.
 
 \end{document}