\setlength\parindent{0in}
\setlength\parskip{0.1in}
+\newcommand{\note}[1]{{$\rightarrow$ \bf Note: \emph{#1}}}
+
\begin{document}
\title{
\includegraphics[height=1.5in]{v3vee.pdf}
\includegraphics[height=1.5in]{logo6.png} \\
\vspace{0.5in}
-Palacios Internal Developer Manual
+Palacios Internal and External Developer Manual
}
\maketitle
-This manual is written for Internal Palacios developers. It contains
-information on how to obtain the palacios code base, how to go about
-the development process, and how to commit those changes to the
-mainline source tree. This assumes that the reader has read {\em An
-Introduction to the Palacios Virtual Machine Monitor -- Release 1.0}
-and also has a slight working knowledge of {\em git}.
-
+This manual is written for internal and external Palacios
+developers. It contains information on how to obtain the Palacios code
+base, how to go about the development process, and how to commit those
+changes to the mainline source tree. This assumes that the reader has
+read the technical report {\em An Introduction to the Palacios Virtual
+Machine Monitor -- Release 1.0}\footnote{It's important to note that
+there have been substantial changes in the build process from 1.0 to
+1.2 and beyond. Hence, the technical report is primarily useful as an
+explanation of the theory of operation of Palacios. This document is
+the one to consult for the build process.} and also has a slight
+working knowledge of {\em git}. You will also want to read the
+document {\em Building a bootable guest image for Palacios and Kitten}
+in order to understand how to build an extremely lightweight guest
+environment, suitable for testing. If you want to configure network
+booting for testing on real hardware, you'll want to read the document
+{\em Booting Palacios/Kitten Over the Network Using PXE}.
+
+Please note that Palacios and Kitten are under rapid development,
+hence this manual may very well be out of date!
+
+\newpage
+\tableofcontents
+\newpage
+\listoffigures
+\newpage
\section{Overview}
models. A central repository exists that holds the master version of
the code base. This central repository is cloned by multiple people
and in multiple places to support various development efforts. A
-feature of git is that every developer actually has a fully copy of
+feature of \texttt{git} is that every developer actually has a full copy of
the entire repository, and so can function independently until such
-time as they need to resync with the master version.
+time as they need to re-sync with the master version.
There are typically multiple levels of access to the central
repository, that are granted based on the type of developer being
privileges are:
\begin{itemize}
-\item Core Developers: These are the lead developers and are in
+\item Core developers: These are the lead developers and are in
charge of managing the master repository. They have full read/write
access permissions to the central repository.
-\item Internal Developers: Formal members of the development
+\item Internal developers: Formal members of the development
team. These people are capable of pulling directly from the central
repository, but lack the ability to write directly to it.
-\item External Developers: People who are not actual members of the
+\item External developers: People who are not actual members of the
development team. These people can only access the public repository
which is only updated to contain the release versions.
\end{itemize}
+Students doing independent study or REUs related to Palacios are set
+up as internal developers. EECS 441 (Resource Virtualization)
+students are generally set up as either internal or external
+developers, depending on their projects.
+
Because internal and external developers cannot write directly to the
master repository, they need to first submit their changes to a core
developer before they can be added to the mainline. We will discuss
\section{Checking out Palacios}
-The central palacios repository is located on {\em
+
+The central Palacios repository is located on {\em
newskysaw.cs.northwestern.edu} in {\em /home/palacios/palacios}. All
internal developers have read access to the directory. Each developer
must create their own local version of the repository, this is done
git clone /home/palacios/palacios
\end{verbatim}
-This creates a local copy of the repository at {\em ./palacios/}.
+On the machine {\em newbehemoth.cs.northwestern.edu} you will want to
+use the following command instead. The {\em newskysaw} home
+directories are NFS-mounted on /home-remote.
+\begin{verbatim}
+git clone /home-remote/palacios/palacios
+\end{verbatim}
-All development work is done in the {\em devel} branch of the
-repository. The developer can access this branch via:
+
+On any other machine, you can clone the repository via ssh, provided
+you have a newskysaw account:
+
+\begin{verbatim}
+git clone ssh://you@newskysaw.cs.northwestern.edu//home/palacios/palacios
+\end{verbatim}
+
+External developers can clone the public repository via the web. The
+public repository tracks the release and main development branch
+(e.g., devel) of the internal repository with a 30 minute delay. The
+access information is available on the web site (http://v3vee.org).
+The web site also includes a publicly accessible gitweb interface to
+allow browsing of the public repository. The clone command looks like
+
+\begin{verbatim}
+git clone http://v3vee.org/palacios/palacios.web/palacios.git
+\end{verbatim}
+
+No matter how you clone, the clone command creates a local copy of the
+repository at {\em ./palacios/}.
+
+Note that both {\em newskysaw} and {\em newbehemoth} have all the
+tools installed that are needed to build and test Palacios and Kitten.
+If you develop on another machine, you will need to set those tools up
+for yourself. This isn't hard and the tools are all free. See the
+technical report for what tools you will need.
+
+When you first clone the repository, you will get the {\em master}
+branch, which is used to generate releases. All development work is
+done in the {\em devel} branch of the repository. The developer can
+access this branch via:
\begin{verbatim}
git checkout --track -b devel origin/devel
/opt/vmm-tools/bin/checkout_branch devel
\end{verbatim}
-{\em Important:}
-Note that palacios is very actively developed so the contents of the
-{\em devel} branch are frequently changing. In order to keep up to
-date with the latest version, it is necessary to periodically pull the
-latest changes from the master repository by running \verb.git pull..
+
+{\em Important:} Note that Palacios is very actively developed so the
+contents of the {\em devel} branch are frequently changing (and
+broken!). In order to keep up to date with the latest version, it is
+necessary to periodically pull the latest changes from the master
+repository by running \verb.git pull..
+
+
+The released versions of Palacios are, currently, 1.0, 1.1, and 1.2.
+To switch to the current release branch, execute
+
+\begin{verbatim}
+git checkout --track -b Release-1.2 origin/Release-1.2
+\end{verbatim}
+
+or
+
+\begin{verbatim}
+/opt/vmm-tools/bin/checkout_branch Release-1.2
+\end{verbatim}
+
\section{Checking out Kitten}
Kitten is available from Sandia National Labs, and is the main host OS
-we are targetting with Palacios. Loosely speaking core Palacios
+we are targeting with Palacios. Loosely speaking, core Palacios
developers are internal Kitten developers, and internal Palacios
-developers are external Kitten developers. Because we have limited
-access to the Kitten repository, we are maintaining a local mirror
-copy in {\em /home/palacios/kitten}.
+developers are external Kitten developers. The public repository for
+Kitten is at {\em http://code.google.com/p/kitten}. To simplify
+things, we are maintaining a local mirror copy on newskysaw in {\em
+/home/palacios/kitten} that tracks the public repository.
-Kitten uses Mercurial for their source management, so you will have to
-make sure the local mercurial version is configured correctly.
-Specifically you should add the following python path to your shell environment.
+Kitten uses Mercurial for source management, so you will have to make
+sure the local Mercurial version is configured correctly.
+Specifically you will probably need to add something like the
+following Python path to your shell environment.
\begin{verbatim}
export PYTHONPATH=/usr/local/lib64/python2.4/site-packages/
\end{verbatim}
-You can then clone Kitten from the local mirror:
+You can then clone Kitten from the local mirror. On {\em newskysaw},
+run:
\begin{verbatim}
hg clone /home/palacios/kitten
\end{verbatim}
+On {\em newbehemoth}, run
+\begin{verbatim}
+hg clone /home-remote/palacios/kitten
+\end{verbatim}
+On other machines, run
+\begin{verbatim}
+hg clone ssh://you@newskysaw.cs.northwestern.edu//home/palacios/kitten
+\end{verbatim}
+External developers, run
+\begin{verbatim}
+hg clone https://kitten.googlecode.com/hg/ kitten
+\end{verbatim}
Both the Kitten and Palacios clone commands should be run from the
-same direcotyr. This means that both repositories should be located at
+same directory. This means that both repositories should be located at
the same directory level. The Kitten build process depends on this.
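+
+As a rough sketch of the resulting layout on {\em newskysaw}, the
+commands below clone both repositories side by side. The parent
+directory name {\em v3vee-work} is only an illustrative choice; any
+directory will do as long as both clones end up in it.
+\begin{verbatim}
+mkdir v3vee-work
+cd v3vee-work
+git clone /home/palacios/palacios
+hg clone /home/palacios/kitten
+ls
+\end{verbatim}
+The final \verb.ls. should list the two sibling directories {\em
+kitten} and {\em palacios}.
+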
-{\em Important:} Like Palacios, Kitten is very actively developed so
-source tree is frequently changing. In order to keep up to date with
-the latest version, it is necessary to periodically pull the latest
-changes from the mirror repository by running \verb.hg pull. followed
-by \verb.hg update..
+{\em Important:} Like Palacios, Kitten is under active development,
+and its source tree is frequently changing. In order to keep up to
+date with the latest version, it is necessary to periodically pull the
+latest changes from the mirror repository by running \verb.hg pull.
+followed by \verb.hg update..
+
+The current release of Kitten, which will work correctly with the current 1.2 release of Palacios, is 1.2.0.
+To switch to the current release branch, execute
+
+\begin{verbatim}
+hg checkout release-1.2.0
+\end{verbatim}
+
\section{Compiling Palacios}
-Palacios is capable of targeting 32 and 64 bit operating systems, and
-includes a build process that supports both these
-architectures. Furthermore, Palacios has multiple build locations,
-with multiple makefiles: a top level build directory and a Palacios
-specific build directory. The Palacios build process first generates a
-static library that includes the Palacios VMM. This static library is
-then linked into a host operating system. Palacios internally supports
-GeekOS and can generate a complete OS image via a unified build
-process. To combine Palacios with Kitten, it is necessary to first
-compile Palacios and then to compile Kitten externally link it with
-Palacios. The output of the compilation process is a bit more complex
-and generates multiple binaries, and the specifics can be found in the
-Makefiles.
-
-The top level build directory provides a number of high level make
-targets, and is located in {\em palacios/build/}. It supports building
-32 and 64 bit versions of the Palacios library independently as well
-as building an integrated version of GeekOS. The basic targets are:
+
+The Palacios build process has been changed recently from a homegrown
+environment to the widely used KBuild environment. KBuild is also
+used for building Kitten. Because KBuild is the build environment
+used for Linux, much of what you learn about configuring and building
+Linux kernels is readily applicable to Palacios and Kitten.
+
+The output of the Palacios build process is a static library that
+includes the Palacios VMM and relevant guest support code blocks. This
+static library is then linked into a host operating system. Palacios
+internally supports GeekOS and can generate a complete OS image via a
+unified build process. By complete OS image, we mean an ISO image
+containing GeekOS, Palacios, and a guest image (another ISO image) for
+testing.
+
+These days, however, Palacios is typically embedded into Kitten, not
+GeekOS. To combine Palacios with Kitten, it is necessary to first
+configure and build Palacios, then to configure and build Kitten,
+linking in Palacios. Kitten can also be configured to link in a
+guest image for testing.
+
+\subsection{Configuration}
+
+To configure Palacios, enter the top level Palacios directory and
+execute:
+\begin{verbatim}
+make clean
+make xconfig
+\end{verbatim}
+At this point, you will be presented with a KBuild configuration
+screen, similar to what you would see in configuring a Linux kernel.
+Palacios has far fewer options, however. If you don't have X or
+don't want the graphical configuration system, you can also use the
+\verb.menuconfig. or \verb.config. targets. The available options
+change over time, so we do not cover all of them here, but here are a
+few that are usually important. We note how to set these options to
+configure a minimal VMM in the following.
+\begin{itemize}
+\item Target Configuration:
\begin{itemize}
-\item \verb.make palacios-full32. -- Generates a 32 bit version of the Palacios static library
-\item \verb.make palacios-full64. -- Generates a 64 bit version of the
-Palacios static library
-\item \verb.make geekos. -- Compiles the GeekOS kernel, and link it with the
-Palacios static library
-\item \verb.make geekos-iso. -- Generate an ISO boot disk image from the
-GeekOS kernel that has been compiled
-\end{itemize}
-The second build directory is located at {\em palacios/palacios/build}
-and handles only the Palacios compilation process. It supports a
-differnt set of targets and arguments:
+\item Red Storm (Cray XT3/XT4) --- turn on to target Cray XT4
+supercomputers. (off)
+\item AMD SVM Support --- targets AMD processors with the SVM hardware
+virtualization features (on)
+\item Intel VMX support --- targets Intel processors with the VMX
+hardware virtualization features (on)
+\item Compile for a multi-threaded OS (on)
+\item Enable VMM telemetry support --- this is lightweight logging and
+data collection (on)
+\item Enable VMM instrumentation --- this is heavyweight logging and
+data collection (off)
+\item Enable passthrough video --- this lets a guest write directly to
+the video card (off) (this is outdated and handled in a different way now)
+\item Enable experimental options --- this makes it possible to select
+features that are under current development (off). You probably want
+to leave this all off. The VNET suboption is for an experimental VMM-embedded
+overlay network under development by Lei Xia and Yuan Tang.
+\item Enable built-in versions of stdlib functions --- this adds
+needed stdlib functions that the host OS may not supply. For use with
+Kitten turn on and enable strcasecmp() and atoi().
+\item Enable built-in versions of stdio functions (off)
+\end{itemize}
+\item Symbiotic Functions (these are experimental options for Jack
+Lange's thesis).
\begin{itemize}
-\item \verb.make ARCH=32. -- iteratively compiles a 32 bit version of Palacios
-\item \verb.make ARCH=64. -- iteratively compiles a 64 bit version of
-Palacios
-\item \verb.make ARCH=32 world. -- fully recompiles a 32 bit version of
-Palacios
-\item \verb.make ARCH=64 world. -- fully recompiles a 64 bit version of
-Palacios
+\item Enable Symbiotic Functionality --- This adds symbiotic features
+to Palacios, specifically support for discovery and configuration by
+symbiotic guests, the SymSpy passive information interface for
+asynchronous symbiotic guest $\leftrightarrow$ symbiotic VMM
+information flow, and the SymCall functional interface for synchronous
+symbiotic VMM $\rightarrow$ symbiotic guest upcalls. (off)
+\item Symbiotic Swap --- Enables the SwapBypass symbiotic service for
+symbiotic Linux guests. (off)
\end{itemize}
-
-Both build levels support compilation directives that control the
-debugging messages that are generated by Palacios. These are specified
-by appending a \verb.DEBUG_<COMPONENT>=1. to the end of the
-\verb.make. command. The components that are currently supported are:
+\item Debug Configuration
\begin{itemize}
-\item \verb.DEBUG_ALL=1. -- enables debugging for all the VMM components
-({\em Warning:} this generates a {\em lot} of debug information.
-\item \verb.DEBUG_SHADOW_PAGING=1.
-\item \verb.DEBUG_CTRL_REGS=1.
-\item \verb.DEBUG_INTERRUPTS=1.
-\item \verb.DEBUG_IO=1.
-\item \verb.DEBUG_KEYBOARD=1.
-\item \verb.DEBUG_PIC=1.
-\item \verb.DEBUG_PIT=1.
-\item \verb.DEBUG_NVRAM=1.
-\item \verb.DEBUG_GENERIC=1.
-\item \verb.DEBUG_EMULATOR=1.
-\item \verb.DEBUG_RAMDISK=1.
-\item \verb.DEBUG_XED=1.
-\item \verb.DEBUG_HALT=1.
-\item \verb.DEBUG_DEV_MGR=1.
-\item \verb.DEBUG_APIC=1.
+\item Compile with Debug info --- adds debug symbols (-g) (off)
+\item Enable Debugging --- makes it possible to show PrintDebug output
+(on). You can selectively turn on debugging output for each major
+VMM component, including shadow paging, nested paging, control
+registers, interrupts, I/O, instruction emulation and XED, halt, and
+the device manager. Note that the more debugging output you turn on,
+the slower the VMM will go since it will have to wait for the prints
+to finish.
+\end{itemize}
+\item BIOS Selection --- Lets you select which code blobs will be
+used for bootstrapping the guest. There are currently three: a
+BIOS, a Video BIOS, and the VMXAssist V8086 service (the latter is used only on
+Intel VMX). Generally, you should not need to change these.
+\item Virtual Devices --- virtual devices can be instantiated and
+added to a guest. The following is a list of the currently
+implemented virtual devices.
+\begin{itemize}
+\item BOCHS Debug Console Device --- used for debugging output from
+the guest BIOS (on)
+\item OS Debug Console Device --- used for debugging output from
+the guest kernel (on)
+\item 8259A PIC - legacy Programmable Interrupt Controller chip --- used
+for bootstrap of most guests (on)
+\item APIC - in-processor Advanced Programmable Interrupt Controller --- used
+for interrupt delivery on almost all guests (on)
+\item IOAPIC - Off-chip APIC --- used for interrupt delivery for almost
+all guests (on)
+\item PIT - legacy 8254 timer (on)
+\item i440fx Northbridge --- emulation of a typical PC North Bridge
+chip, used on almost all guests (on)
+\item PIIX3 Southbridge --- emulation of a typical PC South Bridge
+chip, used on almost all guests (on)
+\item PCI --- emulation of a PCI bus - needed for attaching most
+devices (on)
+\begin{itemize}
+\item Passthrough PCI --- allows us to make a hardware PCI device visible and
+directly accessible by the guest (off)
+\end{itemize}
+\item Generic --- this is a run-time configurable device that can intercept I/O port read/writes and memory region reads/writes. Intercepted reads and writes can either be ignored or forwarded to actual hardware, and the data flow can optionally be printed. This is a useful tool with at least three purposes. First, it makes it possible to ``stub out'' hardware that isn't currently implemented and for which we don't want to allow passthrough access. Second, it makes it possible to provide low-level passthrough access to physical hardware. Third, it can be used to spy on guest/device interactions, which is very helpful when trying to understand the interface of a device.
+\item NVRAM - motherboard configuration memory --- needed by BIOS bootstrap (on)
+\item Keyboard - Generic PS/2 keyboard, including mostly broken mouse
+implementation (on)
+\item IDE --- Support for virtual IDE controllers that support disks
+and CD ROMs (on)
+
+\item NE2K - NE2000 and RTL8139 network devices (off)
+\item CGA - CGA video card (partial implementation) (off)
+\begin{itemize}
+\item Telnet Virtual Console (off) When CGA and Telnet Console are
+on, it is possible to telnet to the console of the guest. Eventually
+the rest of this will provide a simple bitmapped video console for VNC
+access.
+\end{itemize}
+\item RAMDISK storage backend --- used to create RAM disk
+implementations of block devices (on)
+\item NETDISK storage backend --- used to create network-attached disk
+implementations of block devices, e.g., network block devices (off)
+\item TMPDISK storage backend --- used to create temporary storage
+implementations of block devices (on)
+\item Linux Virtio Balloon Device --- used for memory ballooning by
+Linux virtio-compatible guests (off)
+\item Linux Virtio Block Device --- used for fast block device support
+by Linux virtio-compatible guests (off)
+\item Linux Virtio Network Device --- used for fast network device support
+by Linux virtio-compatible guests (off)
+\item Linux Virtio Symbiotic Device (off)
+\item Symbiotic Swap Disk (multiple versions) --- used for the
+SwapBypass service (off)
+\item Disk Performance Model --- used for the
+SwapBypass service (off)
+\end{itemize}
\end{itemize}
+\subsection{Compilation}
+
+After configuring Palacios---remember to save your changes---you can
+compile it by executing
+\begin{verbatim}
+make
+\end{verbatim}
+This will produce the file {\em libv3vee.a} in the current directory.
+This static library contains the Palacios VMM and is ready for
+embedding into an OS, such as Kitten. The library provides the
+ability to instantiate and run virtual machines. By default, on a 64
+bit machine, the library is compiled for 64 bit machines (x86\_64),
+while on a 32 bit machine, it is compiled for 32 bit machines. You
+can override this using the ARCH=i386 or ARCH=x86\_64 arguments to the
+make, provided you have the relevant tools available. The 64 bit
+version is what you need for use with Kitten. A 64 bit Palacios can
+run both 64 and 32 bit guests. Both {\em newskysaw} and {\em
+newbehemoth} are 64 bit machines.
+
+The compilation process will also create the utility {\em build\_vm}
+(which builds guest images from XML description files), and a very
+simple guest image called {\em guest\_os.img} that essentially contains
+a Linux kernel and BusyBox. The default Kitten configuration will use
+this guest image.
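+
+Putting the configuration and compilation steps together, a typical
+64 bit build from a fresh clone might look like the following sketch.
+It assumes you prefer the text-mode \verb.menuconfig. target and want
+to state the architecture explicitly; both choices are optional.
+\begin{verbatim}
+cd palacios
+make clean
+make menuconfig
+make ARCH=x86_64
+ls libv3vee.a build_vm guest_os.img
+\end{verbatim}
+If the build succeeded, the final \verb.ls. should find the library,
+the guest builder, and the default guest image described above.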
\section{Compiling Kitten}
-Kitten requires a 64 bit version of Palacios, so make sure that
-Palacios has been correctly compiled before compiling Kitten.
+Kitten requires a 64-bit version of Palacios, so make sure that
+Palacios has been correctly compiled before compiling Kitten. The
+current default for Palacios is 64 bit.
\subsection{Configuration}
Kitten borrows a lot of concepts from Linux, including the Linux build
\item \verb.make menuconfig.
\end{itemize}
-There are some specific configuration options that should be disabled
-to work with Palacios. Because Palacios is configured by default to
-provide a guest with direct access to the VGA console, the {\em VGA
-console} device driver should be disbabled in the Kitten
-configuration. Similarly the {\em VM console} driver should be
-disabled as well.
+Of course, there is a range of configuration options. In the
+following, we note only the most important. The indicated values are
+defaults for the simplest interaction between Kitten and Palacios.
+\begin{itemize}
+\item Target Configuration
+\begin{itemize}
+\item System Architecture --- you probably want to set this to
+PC-Compatible, unless you are working on Red Storm.
+\item Processor Family --- you want to set this to either
+AMD-Opteron/Athlon64 or Intel-64/Core2, depending on whether you have
+a 64 bit AMD or 64 bit Intel processor.
+\end{itemize}
+\item Virtualization
+\begin{itemize}
+\item Include Palacios VMM --- this will link against the Palacios
+library (on)
+\item Path to pre-built Palacios tree --- directory where libv3vee.a
+can be found.
+\item Path to guest image --- location of the test guest OS
+image that will be embedded. We will say more about this later.
+Essentially, however, a guest image consists of a blob that begins
+with an XML description of the desired guest environment and the
+contents of the remainder of the blob. The remainder of the blob
+usually contains disk or cd images. The default path is
+../palacios/guest\_os.img, where it will find the simple guest created
+during the Palacios build process.
+\end{itemize}
+\item Networking
+\begin{itemize}
+\item Enable LWIP TCP/IP stack. This activates a simple TCP/IP stack
+that things like NETDISK can use. (off)
+\end{itemize}
+\item Device Drivers
+\begin{itemize}
+\item VGA Console --- driver for basic video. (on)
+\item Serial Console --- driver for serial port console. (on)
+\item VM Console --- driver for Kitten console on top of Palacios. If
+Kitten is run {\em as a guest}, and it has VM Console on, then it can output
+cleanly via the Palacios OS Console device (on).
+\item NE2K Device Driver --- driver for NE2K and RTL8139 network cards
+(off)
+\item VM Network Driver --- driver for Kitten network output using
+Palacios. If Kitten is run {\em as a guest}, and it has VM Network
+Driver, then it can send and receive packets using the Palacios Linux
+virtio network device. (off)
+\end{itemize}
+\item ISOIMAGE configuration: Kitten kernel arguments. Note that
+this is NOT for the guest image, but rather for the Kitten image. You
+can leave this alone. For Palacios operation, it's important that the
+option \verb.console=serial. appears. If the NE2K/RTL8139 driver
+should be used, \verb.net=rtl8139. should appear.
+\item Kernel Hacking
+\begin{itemize}
+\item Kernel Debugging --- here you can turn on various Kitten
+Linux-like debugging features. Only a few are noted below:
+\begin{itemize}
+\item Compile the kernel with debug info --- if this is on, you will
+have debugging information compiled in (-g)
+\item KGDB --- if you have this enabled, you will be able to attach to
+the running kernel from the GDB debugger. This means you can also
+attach to Palacios, which is embedded. If you want to debug Palacios
+using KGDB, be sure to turn on debugging in Palacios as well.
+\end{itemize}
+\end{itemize}
+\item Include Linux compatibility layer --- if this is on, you can
+selectively add Linux system calls and other functionality to Kitten.
+Kitten is able to run Linux ELF executables as user processes with
+this layer. (off)
+\end{itemize}
-Furthermore, because the VGA console is not being used the {\em Kernel
-Command Line Arguments} must be modified to remove the {\em VGA}
-device from the console list.
+The guest OS that is to be booted as a VM is included as a blob
+pointed to by ``Path to guest image''. The blob starts with an XML
+description of the guest, followed by other chunks of data used, for
+example, as the content of virtual hard drives or CD ROMs. Please see
+Section~\ref{sec:guestconfig} for basic information on how to use the
+guest builder to assemble a guest OS blob.
-The guest OS that is booted as a VM is included as an ISO image in raw
-binary format inside Kitten's {\em init\_task}. To change the guest
-ISO, you must change the makefile for the init\_task. This is located
-in {\em user/hello\_world/Makefile} and the syntax is well commented.
-On {\em newskysaw} a collection of guest ISO images are located in
-{\em /opt/vmm-tools/isos/}.
+By default, the init task that is executed after Kitten boots (located
+in user/hello\_world) does a number of Kitten tests. One of these is
+a test of the VMM API, which is implemented using Palacios. When this
+test is done, a VM is created, configured according to the XML, and
+the guest OS blob is launched in it.
\subsection{Compilation}
-After Kitten has been configured the compilation can be done. The
-general process is to compile a reference build of Kitten, followed by
-compiling Palacios support as a kernel module, and then doing a new
-full recompilation of Kitten.
-The specific compilation steps are run from the top level Kitten directory:
+After Kitten has been configured it can be compiled. This is done
+simply by executing
\begin{verbatim}
-make
-cd palacios
-make -C .. M=`pwd`
-cp built-in.o ../modules/palacios-mod.o
-cd ..
-make
make isoimage
\end{verbatim}
+This command will compile Kitten (with Palacios embedded in it) and
+the init task (which will contain the guest OS blob), and then
+assemble an ISO image file which can be used to boot a machine. The ISO
+image is located at {\em ./arch/x86\_64/boot/image.iso}.
+
+This image file can be used for booting a QEMU emulation environment,
+for booting a remote machine using PXE, or can be burned to CD/DVD for
+booting a machine physically.
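+
+As a sketch, the whole Kitten side of the build might look like the
+following when run from the directory that holds both clones. It
+assumes Palacios has already been built in the sibling {\em palacios}
+directory and that the text-mode configuration target is used.
+\begin{verbatim}
+cd kitten
+make menuconfig
+make isoimage
+ls -l arch/x86_64/boot/image.iso
+\end{verbatim}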
+
+
+\section{Basic Guest Configuration}
+\label{sec:guestconfig}
+
+A simple guest is created when you build Palacios. To configure your
+own guest, you write an XML configuration file, which contains
+references to other files that contain data needed to instantiate
+stateful devices such as virtual hard drives and CD ROMs. You supply
+this information to a guest builder utility that assembles a guest
+image suitable for reference in the Kitten configuration, as described
+above.
+
+The guest builder utility is located in {\em
+palacios/utils/guest\_creator}. You will need to run \verb.make. in
+that directory to compile it, resulting in the executable named {\em
+build\_vm}\footnote{This executable will also be copied into the top-level
+Palacios directory.}. Also located in that directory is an example
+configuration file, named {\em default.xml}. We typically use this
+file as a template. It is carefully commented. In summary, a
+configuration consists of
+\begin{itemize}
+\item Physical memory size of the guest
+\item Basic VMM settings, such as what form of virtual paging is to be
+used, the scheduler rate, whether services like telemetry are on, etc.
+\item A memory map that maps regions of the host physical address
+space to the guest physical address space. This can, for example,
+make a framebuffer visible in the guest.
+\item A list of the files that will be used in assembling the image.
+For example, the contents of a boot CD.
+\item A list of the devices that the guest will have, including
+configuration data for each device.
+\end{itemize}
+There are a few subtleties involved with devices. One is that some
+devices form attachment points for other devices. The PCI device is
+an example of this. Another is that each device needs to define how
+it is attached (e.g., direct (implicit), via a PCI bus, etc.).
+Finally, there may be multiple instances of devices. For example, a
+PCI passthrough device is instantiated for every underlying PCI device
+we want to make visible in the guest.
+
+The XML configuration format is carefully designed to be extensible.
+For example, new devices could use additional or new configuration
+options. The configuration parser in Palacios essentially ignores XML
+blocks it doesn't understand.
+
-This generates an ISO boot image containing Kitten, Palacios, and the
-guest that will be run as a VM. The ISO image is located at {\em
-./arch/x86\_64/boot/image.iso}.
+To build a guest, one runs
+\begin{verbatim}
+palacios/utils/guest_creator/build_vm myconfig.xml -o myimage.dat
+\end{verbatim}
+Here, {\em myimage.dat} is the guest image that can be given to
+Kitten.
+
+A common kind of guest used for testing is one that boots some form of
+bootable Linux distribution, or another live OS distribution. These
+distributions are CD ROM images (ISOs). A range of them are available
+on {\em newskysaw} under {\em /opt/vmm-tools/isos}. We often use
+Puppy Linux ({\em puppy.iso}) or Finnix ({\em finnix.iso}), for
+example, but ISOs are also available for Windows of different flavors,
+DOS, GeekOS, and others. If you just want to use some guest ISO image
+like this, you can generally just copy the default XML file, and
+modify the
+\verb.filename=. attribute here:
+\begin{verbatim}
+ <files>
+ <!-- The file 'id' is used as a reference for
+ other configuration components -->
+ <file id="boot-cd" filename="/home/jarusl/image.iso" />
+ <!--<file id="harddisk" filename="firefox.img" />-->
+ </files>
+\end{verbatim}
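+For example, to build a guest that boots Puppy Linux on {\em
+newskysaw}, one might proceed roughly as follows. The names {\em
+puppy.xml} and {\em puppy.dat} are arbitrary choices made for this
+sketch, and the edit in the middle step is done by hand.
+\begin{verbatim}
+cp palacios/utils/guest_creator/default.xml puppy.xml
+# edit puppy.xml so that the boot-cd entry reads
+#   filename="/opt/vmm-tools/isos/puppy.iso"
+palacios/utils/guest_creator/build_vm puppy.xml -o puppy.dat
+\end{verbatim}
+The resulting {\em puppy.dat} is then what ``Path to guest image'' in
+the Kitten configuration should point to.
+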
+For careful, repeatable experimentation, it is often convenient to
+build your own simplified Linux guest image. It will boot {\em much}
+faster than a full blown distribution and you can readily set up an
+environment in which you can exert very tight control, being able to
+modify the Linux kernel, the included files (e.g., benchmarks), and
+other components very rapidly. To learn more about how to do this,
+please consult the separate document named {\em Building a bootable
+guest image for Palacios and Kitten}.
\section{Running Palacios/Kitten}
-Kitten and Palacios are capable of running under Qemu, which makes
-debugging much simpler.
+Kitten and Palacios are capable of running under QEMU, which makes
+debugging much simpler. QEMU is a user-level Linux or Windows program
+that emulates a PC machine.
-The basic form of the command to start the Qemu emulator is:
+The basic form of the command to start the QEMU emulator is:
\begin{verbatim}
/usr/local/qemu/bin/qemu-system-x86_64 -smp 1 -m 1024 \
-serial file:./serial.out \
< /dev/null
\end{verbatim}
-The command starts up a single processor emulated machine, with 1gig
-of RAM and a cdrom drive loaded with the Kitten ISO image. Furthermore
-all output to the serial port is written directly to a file called
-{\em serial.out}. This command can be copied into a shell script for easy access.
+The command starts up a single processor emulated machine, with 1GB of
+RAM and a CD-ROM drive loaded with the Kitten ISO image. All output
+to the serial port is written directly to a file called {\em
+ serial.out}. This command can be copied into a shell script for easy
+access.
+
+We can also run Palacios/Kitten on physical hardware. The slow way is
+to burn the Kitten ISO image onto a CD-ROM and then boot the test
+machine with it. The much faster way is to set the test machine up to
+use the PXE network boot system (most modern BIOSes support this), and
+boot your Kitten image over the network. The debugging output will
+then appear on the actual serial port of the physical machine. The
+separate document {\em Booting Palacios/Kitten Over the Network Using
+PXE} explains how to set up PXE boot and serial. For the Northwestern
+environment, please talk to Jack, Peter, Lei, or Yuan if you need to
+be able to do this. Northwestern has a range of AMD and Intel boxes
+for testing, as do UNM and Sandia. A different form of network boot
+is used for Red Storm.
+
\section{Development Guidelines}
There are standard requirements we have for code entering the mainline.
-First and foremost, Palacios is designed to be OS indenpendent and
-support 32 and 64 bit architectures. This means that developers should
+First and foremost, Palacios is designed to be OS independent and
+support 32-bit and 64-bit architectures. This means that developers should
not include any external OS specific dependencies in any Palacios
-component. Also all changes need to be tested on both 32 and 64 bit
-architectures to make sure that they compile as well as run corrrectly.
+component. Also all changes need to be tested on both 32-bit and 64-bit
+architectures to make sure that they compile as well as run correctly.
\paragraph*{Coding Style}
-"The use of equal negative space, as a balance to positive space, in a
+``The use of equal negative space, as a balance to positive space, in a
composition is considered by many as good design. This basic and often
-overlooked principle of design gives the eye a "place to rest,"
-increasing the appeal of a composition through subtle means."
+overlooked principle of design gives the eye a `place to rest,'
+increasing the appeal of a composition through subtle means.''
\newline\newline
-Translation: Use the spacebar, newlines, and parentheses.
+Translation: Use the space bar, newlines, and parentheses.
Curly-brackets are not optional, even for single line conditionals.
Tabs should be 4 characters in width.
-{\em Special:} If you are using XEmacs add the following to your \verb.init\.el. file:
+{\em Special:} If you are using XEmacs add the following to your \verb!init.el! file:
\begin{verbatim}
(setq c-basic-offset 4)
(c-set-offset 'case-label 4)
{\em Good}
-
\begin{verbatim}
if (((a) && (b == 5)) ||
(c != 0)) {
\paragraph*{Fail Stop}
-Because booting a basic linux kernel results in over 1 million VM exits
+Because booting a basic Linux kernel results in over 1 million VM exits
catching silent errors is next to impossible. For this reason
ANY time your code has an error it should return -1, and expect the
execution to halt.
header file. You should make an effort to use static functions
whenever possible.
-\item \verb.v3_. prefix
-\newline
-Major interface functions should be named with the prefix \verb.v3_. This
-xallows easy understanding of how to interact with the subsystems. And
-in the case that they need to be externally visible to the host os,
-make them unlikely to collide with other functions.
+\item \verb.v3_. prefix \newline Major interface functions should be
+ named with the prefix \verb.v3_. This allows easy understanding of
+ how to interact with the subsystems. In the case that they need to
+ be externally visible to the host OS, make them unlikely to collide
+ with other functions.
\end{enumerate}
\paragraph*{Debugging Output}
-Debugging output is sent through the host os via functions in the
+Debugging output is sent through the host OS via functions in the
\verb.os_hooks. structure. These functions have various wrappers of the form
-\verb.Print*., with printf style semantics.
+\verb.Print*., with \texttt{printf}-style semantics.
Two functions of note are \verb.PrintDebug. and \verb.PrintError..
Git includes support for directly exporting local repository commits
as a patch set. The basic operation is for a developer to commit a
change to a local repository, and then export that change as a patch
-that can be applied to another git repository. While this is
-functionally possible, there are a number of issues. The main problem
-is that it is difficult to fully encapsulate a new feature in a single
-commit, and dealing with multiple patches that often overwrite each
-other is not a viable option either. Furthermore, once a patch is
-applied to the mainline, it will generate a conflicting commit that
-will become present when the developer next pulls from the central
+that can be applied to another git repository. Patch generation is
+done with {\em git format-patch}. While this is functionally
+possible, there are a number of issues. The main problem is that it is
+difficult to fully encapsulate a new feature in a single commit, and
+dealing with multiple patches that often overwrite each other is not a
+viable option either. Furthermore, once a patch is applied to the
+mainline, it will generate a conflicting commit that will become
+present when the developer next pulls from the central
repository. This can result in both repositories getting out of
-sync. It is possible to deal with this by manually rebasing the local
-repository, but it is difficult and error-prone.
+sync. It is possible to deal with this by manually rebasing the local
+repository, but it is difficult and error-prone.
This approach also does not map well when patches are being revised. A
normal patch will go through multiple revisions as it is reviewed and
For this reason most internal developers should seriously consider
{\em stacked git}. Stacked git is designed to make patch development
easier and less of a headache. The basic mode of operation is for a
-developer to intialize a patch for a new feature and then continuously
-apply changes to the patch. Stacked Git allows a developer to layer a
-series of patches on top of a local git repository, without causing
-the repository to unsync due to local commits. Basically, the
-developer never commits changes to the repository itself but instead
-commits the changes to a specific patch. The local patches are managed
-using stack operations (push/pop) which allows a developer to apply
-and unapply patches as needed. Stacked git also manages new changes to
-the underlying git repository as a result of a pull operation and
-prevents collisions as changes are propagated upstream. For instance
-if you have a local patch that is applied to the mainline as a commit,
-when the commit is pulled down the patch becomes empty because it is
-effectively identical to the mainline. It also makes incorporating
-external revisions to a patch easier. Stacked git is installed on {\em
-newskysaw} in \verb./opt/vmm-tools/bin/.
+developer to initialize a patch for a new feature and then
+continuously apply changes to the patch. Stacked Git allows a
+developer to layer a series of patches on top of a local git
+repository, without causing the repository to unsync due to local
+commits. Basically, the developer never commits changes to the
+repository itself but instead commits the changes to a specific
+patch. The local patches are managed using stack operations (push/pop)
+which allows a developer to apply and unapply patches as
+needed. Stacked git also manages new changes to the underlying git
+repository as a result of a pull operation and prevents collisions as
+changes are propagated upstream. For instance if you have a local
+patch that is applied to the mainline as a commit, when the commit is
+pulled down the patch becomes empty because it is effectively
+identical to the mainline. It also makes incorporating external
+revisions to a patch easier. Stacked git is installed on {\em
+newskysaw} and {\em newbehemoth} in \verb./opt/vmm-tools/bin/.
Brief command overview:
\begin{itemize}
-\item \verb.stg init. -- Initialize stacked git in a given branch
-\item \verb.stg new. -- create a new patch set, an editor will open
+\item \verb.stg init. -- initialize stacked git in a given branch
+\item \verb.stg new. -- create a new patch set; an editor will open
asking for a commit message that will be used when the patch is
ultimately committed.
\item \verb.stg pop. -- pops a patch off of the source tree.
that can then be emailed.
\item \verb.stg refresh. -- commits local changes to the patch set at
the top of the applied stack.
-\item \verb.stg fold. -- Apply a patch file to the current
+\item \verb.stg fold. -- apply a patch file to the current
patch. (This is how you can manage revisions that are made by other developers).
\end{itemize}
-You should definately look at the online documentation to better
+You should definitely look at the online documentation to better
understand how stacked git works. It is not required of course, but if
-you want your changes to be applied its up to you to generate a patch
-that is acceptable to a core developer. Ultimately using Stacked git
+you want your changes to be applied it's up to you to generate a patch
+that is acceptable to a core developer. Ultimately, using Stacked git
should be easier than going it alone.
\end{verbatim}
-%Also, remember that Kitten is not a Northwestern project and is being
-%developed by professional developers at Sandia National Labs. So keep
-%in mind that you are representing Northwestern and the rest of the
-%Palacios development group. We are collaborating with them because
-%Kitten and the resources they have are very important for our research
-%efforts. They are collaborating with us because they believe that
-%Palacios might be able to help them. Therefore it is important that we
-%continue to ensure that they see value in our collaboration. In plain
-%terms, we need to make sure they think we're smart and know what we're
-%doing. So please keep that in mind when dealing with the Kitten group.
-
\section{Networking}
-\section{Configuring the development host's Qemu network}
-Set up Tap interfaces:
-
-/root/util/tap\_create tapX
-
-Bridging tapX with eth1 will only work (work = send packet and also
-make packet visible on localhost) if the IP address is set correctly
-(correctly = match network it is connected to e.g., network of eth1)
-so bring up the network inside of the VM / QEMU as 10-net, and it
-should route through the eth1 rule and be visible both on the host and
-in the physical network
-
+Both the Kitten and GeekOS substrates on which Palacios can run
+currently include drivers for two simple network cards, the NE2000,
+and the RTL8139. Palacios also supports passthrough I/O for PCI
+devices, meaning we can make NICs directly accessible by guests. The
+Kitten substrate is acquiring an ever increasing set of drivers for
+specialized network systems. A lightweight networking stack is
+included so that TCP/IP networking is possible from within the host OS
+kernel and in Palacios.
+
+When debugging Palacios on QEMU, it is very convenient to add an
+RTL8139 card to your QEMU configuration, and then drive it from within
+Palacios. QEMU can be configured to provide local connectivity to the
+QEMU emulated machine, including bridging the emulated machine with a
+physical network. Local connectivity can be done with redirection, or
+with a TAP interface. For global connectivity, a TAP interface must
+be used; it is bridged to a physical interface.
+
+\section{Configuring the development host's QEMU network}
+
+To get local connectivity with redirection, no networking changes on
+the host are needed. However, people usually want to use TAP-based
+networking, which does require changes. For one thing, TAP interfaces
+can be inspected with tools like wireshark, which makes for much
+easier debugging of network code.
+
+In order to get QEMU networking to function, it is necessary to create
+TAP interfaces, and, optionally, to bridge them to real networks. A
+development machine typically will have several TAP interfaces, and
+more can be created. Generally, each developer should have a TAP
+interface of his or her own. Here we use newskysaw as an example.
+
+To set up a TAP interface on newskysaw, the following command is used:
+\begin{verbatim}
+/root/util/tap_create tapX
+\end{verbatim}
-\subsection{Configuring Kitten}
+When QEMU runs with a tap interface, it will use /etc/qemu-ifup to
+bring up the interface. /etc/qemu-ifup looks like this:
-To enable networking in Qemu, networking needs to be enabled in the configuration.
+\begin{verbatim}
+#!/bin/bash
+echo "Executing /etc/qemu-ifup - no external bridging"
+echo "Bringing up $1 for bridged mode..."
+NET=`echo $1 | cut -dp -f2`
+sudo /sbin/ifconfig $1 172.2${NET}.0.1 up
+sleep 2
+\end{verbatim}
-Make sure turn on the network device driver, networking, and input
-kernel command 'console=serial net=rtl8139'
+The interface tap$N$ is brought up with the IP address 172.2$N$.0.1.
+ifconfig will also create a routing rule that sends 172.2$N$.0.0/16
+traffic to tap$N$. The upshot is that if the code running in QEMU
+uses an IP address in this network (for example: 172.2$N$.0.2), you
+will be able to talk to it from newskysaw. For example, from
+newskysaw, if you ping 172.21.0.2, the packet (and ARP) will go out via
+tap1. The source address will appear to be 172.21.0.1. The QEMU
+machine will see these packets on its interface, and the software
+controlling its interface can respond to 172.21.0.1.
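+
+As a rough illustration of the addressing scheme above, the following
+shell sketch derives the host-side address of a TAP interface from its
+name in the same way /etc/qemu-ifup does. The function names
+(\verb.tap_net., \verb.tap_host_addr., \verb.tap_guest_addr.) are
+hypothetical, and the .2 guest address is only the example address
+used in the text, not something QEMU or Palacios assigns.
+

```shell
# Hypothetical helpers mirroring the 172.2N.0.x scheme of /etc/qemu-ifup.

# "tap1" -> "1": the same cut invocation that /etc/qemu-ifup uses.
tap_net() {
    echo "$1" | cut -dp -f2
}

# Address that /etc/qemu-ifup gives the host side of tapN.
tap_host_addr() {
    echo "172.2$(tap_net "$1").0.1"
}

# Example address for the code running inside QEMU (an assumption:
# any unused address in the same network would do).
tap_guest_addr() {
    echo "172.2$(tap_net "$1").0.2"
}
```

+For example, \verb.tap_host_addr tap1. yields 172.21.0.1, and the
+code running in QEMU could then be configured with the address
+printed by \verb.tap_guest_addr tap1..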
+
+This form of networking is local to the machine. You can also bridge
+a TAP interface with a physical interface. The result of this is that
+a packet sent on it will be sent on the physical interface. To do
+this requires more effort (and is not set up by default on newskysaw).
+As an example, consider that on newskysaw, the physical interface eth1
+is connected to a private network switch to which the lab test
+computers (v-test-amd, v-test-amd2, etc.) are connected. To bridge,
+for example, tap10, to this interface, you would do the following
+(with root's help):
+\begin{enumerate}
+\item You need to bring up eth1 (ifconfig eth1 up {\em address}
+netmask {\em mask}). It is important that the address and mask you
+choose are appropriate for the network eth1 is connected to.
+\item You would bring up tap10 without an address: /sbin/ifconfig
+tap10 up
+\item You would bridge tap10 and eth1: /usr/sbin/brctl addif br0
+tap10; /usr/sbin/brctl addif br0 eth1. This assumes that br0 was
+previously created.
+\end{enumerate}
-How to set ip address in kitten:
+Bridging tap$N$ with eth1 will only work (where ``work'' means sending
+a packet on the network and making the packet visible on localhost) if
+the IP address in the code running in QEMU is set correctly. This
+means that it needs to correspond to the network of eth1.
+For the newskysaw configuration, this is a 10-net address.
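+
+The bridging steps can be collected into a small dry-run helper. This
+is only a sketch: \verb.bridge_cmds. is a hypothetical name, it merely
+prints the commands rather than running them (they need root), and it
+assumes that both interfaces are added to the same, already existing
+bridge (br0 unless another name is given).
+

```shell
# Hypothetical dry-run helper: print the commands needed to bridge a TAP
# interface with a physical interface. Nothing is executed.
# usage: bridge_cmds TAP PHYS ADDRESS NETMASK [BRIDGE]
bridge_cmds() {
    tap=$1
    phys=$2
    addr=$3
    mask=$4
    br=${5:-br0}   # assumption: the bridge already exists

    # 1. Bring up the physical interface with an address on its network.
    echo "/sbin/ifconfig $phys up $addr netmask $mask"
    # 2. Bring up the TAP interface without an address.
    echo "/sbin/ifconfig $tap up"
    # 3. Add both interfaces to the bridge.
    echo "/usr/sbin/brctl addif $br $tap"
    echo "/usr/sbin/brctl addif $br $phys"
}
```

+Review the printed commands before having root run them; the address
+and netmask must match the network the physical interface is on.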
-Kitten ip address setting is in file drivers/net/ne2k/rtl8139.c, in the code below which is located in function rtl8139\_init.
- struct ip\_addr ipaddr = { htonl(0 | 10 << 24 | 0 << 16 | 2 << 8 | 16 << 0) };
- struct ip\_addr netmask = { htonl(0xffffff00) };
- struct ip\_addr gw = { htonl(0 | 10 << 24 | 0 << 16 | 2 << 8 | 2 << 0) };
+\subsection{Configuring Kitten}
-This sets the ip address as 10.0.2.16, netmask 255.255.255.0 and gateway address 10.0.2.2, change it as you need.
+Kitten needs to be explicitly configured to use networking. Currently
+only a subset of the networking configurations are supported. To
+enable an Ethernet network you should enable the following options:
+\begin{itemize}
+\item Enable TCP Support
+\item Enable UDP Support
+\item Enable socket API
+\item Enable ARP support
+\end{itemize}
+The other options are not supported, and enabling them will probably
+break the kernel compilation.
-\subsection{Running with networking}
+To allow Kitten to communicate with the QEMU network card you also
+need to enable the appropriate device driver: \newline
+\verb.NE2K Device Driver (rtl8139).
-\paragraph*{Tap Interface}
-In which, the command line:
+The driver then needs to be listed as a Kernel Command Line argument
+in the {\em ISOIMAGE configuration}. To do this add
+\verb.net=rtl8139. to the end of the argument string.
--net tap, ifname=tap2
+Kitten currently does not support the dynamic assignment of IP
+addresses at runtime. Because of this it is necessary to hardcode the
+IP address into the device driver. For the rtl8139 network driver look
+in the file {\em drivers/net/ne2k/rtl8139.c} for the function
+\verb.rtl8139_init..
-specifies Qemu to use the host's tap0 as its network interface, then Qemu can access the host's physical network.
+There should be a block of code that looks like the following:
+\begin{verbatim}
+ struct ip_addr ipaddr = { htonl(0 | 10 << 24 | 0 << 16 | 2 << 8 | 16 << 0) };
+ struct ip_addr netmask = { htonl(0xffffff00) };
+ struct ip_addr gw = { htonl(0 | 10 << 24 | 0 << 16 | 2 << 8 | 2 << 0) };
+\end{verbatim}
-\paragraph*{Redirection}
+This sets the IP address as 10.0.2.16, netmask 255.255.255.0 and
+gateway address 10.0.2.2. Change these assignments to match your configuration.
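+
+To work out the shifted constants for a different address, the
+arithmetic can be checked from the shell. The sketch below is not part
+of Kitten: \verb.pack_ip. and \verb.unpack_ip. are hypothetical
+helpers that reproduce only the shift-and-or packing inside the
+\verb.htonl(). call, not the byte-order conversion itself.
+

```shell
# Hypothetical helpers: pack four octets into one 32-bit value using the
# same shifts as the initializers in rtl8139_init, and unpack it again.

# usage: pack_ip A B C D   (for the address A.B.C.D)
pack_ip() {
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# usage: unpack_ip VALUE   (prints dotted-quad notation)
unpack_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}
```

+Running \verb.unpack_ip. on the output of \verb.pack_ip 10 0 2 16.
+gives back 10.0.2.16, which is a quick way to confirm that the
+octets were put in the intended positions.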
-Also you can use the following command instead to redirect host's 9555 port to Qemu's 80 port.
--net user -net nic,model=rtl8139 -redir tcp:9555::80
+\paragraph*{Kitten as the Guest OS}
-In this case, you can access Qemu's 80 port in the host like:
+When running Kitten as a VM, the above applies except that you will
+want to enable the {\em VMNET} device driver instead of the {\em rtl8139}.
-telnet localhost 9555
-Qemu has many options to build up a virtual or real networking. See http://www.h7.dion.ne.jp/~qemu-win/HowToNetwork-en.html for more information.
+\subsection{Running with networking}
+\paragraph*{TAP Interface}
+Running with a TAP interface provides either local or global
+connectivity (depending on how the TAP interface is configured and/or
+bridged). From the perspective of the QEMU command line, both look
+the same, however. You simply add something like this to the command
+line:
+\begin{verbatim}
+-net tap,ifname=tap2 -net nic,model=rtl8139
+\end{verbatim}
+The first \verb.-net. option indicates that you want to use a tap
+interface, specifically \verb.tap2.. The second \verb.-net. option
+specifies that this interface will appear to code in the QEMU machine
+to be a network interface card of the specific model RTL8139. Note
+that this is a model for which we have a driver. If tap2 were
+bridged, we'd get global connectivity. If not, we would just get
+local connectivity.
+\paragraph*{Redirection}
+It is also possible to achieve limited local connectivity even if you
+have no TAP support on your development machine. In redirection, QEMU
+essentially acts as a proxy, translating TCP or other connections and
+low-level packet operations on the network interface in the QEMU
+machine. For example, the following options will redirect the host's
+9555 port to the QEMU machine's 80 port:
+\begin{verbatim}
+-net user -net nic,model=rtl8139 -redir tcp:9555:10.10.10.33:80
+\end{verbatim}
+The first \verb.-net. option indicates that we are using user-level
+networking (proxying). The second \verb.-net. option indicates that
+this user-level network will appear in the QEMU machine as an RTL8139
+network card. The \verb.-redir. option indicates that connections on
+localhost:9555 will be translated into equivalent packet exchanges on
+the RTL8139 card in the QEMU machine. However, we have to tell QEMU
+which IP address and port to use on the QEMU machine's side. This is
+what the 10.10.10.33 address, and port 80 are. In the example, if you
+access port 9555 on localhost, say with:
+\begin{verbatim}
+telnet localhost 9555
+\end{verbatim}
+the packets that appear in the QEMU machine will be bound for
+10.10.10.33, port 80. Within the QEMU machine, your RTL8139 interface
+had better then be up on that address.
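+
+If you switch between the two modes often, the option strings can be
+generated by a small helper. This is a sketch only:
+\verb.qemu_net_args. is a hypothetical function, and it emits exactly
+the \verb.-net. and \verb.-redir. options shown above, so it assumes a
+QEMU version that accepts them.
+

```shell
# Hypothetical helper: print the QEMU networking options for either
# TAP-based networking or user-level redirection.
# usage: qemu_net_args tap IFNAME
#        qemu_net_args redir HOSTPORT GUESTADDR GUESTPORT
qemu_net_args() {
    case "$1" in
        tap)
            echo "-net tap,ifname=$2 -net nic,model=rtl8139"
            ;;
        redir)
            echo "-net user -net nic,model=rtl8139 -redir tcp:$2:$3:$4"
            ;;
        *)
            echo "usage: qemu_net_args tap IFNAME | redir HOSTPORT GUESTADDR GUESTPORT" >&2
            return 1
            ;;
    esac
}
```

+The output is meant to be appended to the basic QEMU command line, for
+example via \verb.$(qemu_net_args tap tap2). in a launch script.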
+QEMU has many options to build up virtual or real networking. See
+http://www.h7.dion.ne.jp/$\sim$qemu-win/HowToNetwork-en.html for more
+information.
-For more questions, talk to Jack or Lei.
+If you have further questions, talk to Jack, Lei, or Peter.
\end{document}