About Legion Runtime Class
These notes are closely based on the set of Legion Runtime Class videos produced by the Legion developers. They are my own notes and code walks, and any errors or things that are just plain wrong represent my own mistakes.
Today's notes are based on the following video:
Legion is separated into two main components: the high-level runtime (hlr) and the low-level runtime (llr). While the llr (Realm) is concerned with primitives that support execution of operations on a distributed machine, the hlr focuses on the secret Legion sauce.
From a code POV, lowlevel.h is the boundary between the HLR and LLR. The naming convention of files in the Legion repository generally follows the rule that files associated with LLR stuff contain the word lowlevel in the filename, with the exception being activemsg. When reading code, C++ namespaces are used to separate the hlr (LegionRuntime::HighLevel) from the llr (LegionRuntime::LowLevel).
There are currently two implementations of the llr. The first is the high-performance implementation (lowlevel.cc) that uses GASNet and supports GPUs and other HPC-relevant things. The second is an implementation designed for use on a single node (shared_lowlevel.cc). These two implementations may be unified at some point.
In contrast to a typical MPI application, Legion prefers to have a single instance of the run-time running per node (as opposed to per process) which manages all of the node resources. Are there variations and exceptions?
Low-level Runtime Startup
A Legion application is typically started using a launcher (e.g. mpirun or a GASNet launcher) that runs a copy of the application on every node in the system. When a Legion process is launched it will first configure the runtime (e.g. registering tasks) and then it will start the runtime:
HighLevelRuntime::set_top_level_task_id is located in legion.cc and is only a thin wrapper around
This level of indirection is primarily a result of several iterations of the APIs, and may be simplified in the future. A full discussion of the details of things like task registration and the associated data structures will presumably be part of a future class on the llr. Note also that while task registration here is static, dynamic task registration will be supported in the future.
Once we've configured the runtime we start it. There are some static asserts at the beginning, and then the first thing we do is create a Machine instance.
A Machine instance contains lots of information about the (distributed) machine we are running on, not just the local node. Recall that the startup process being described is a result of the Legion process running on all nodes in the system, and as such, each Legion process will eventually have a copy of the Machine instance. That is, each Legion process knows about the structure of the entire system. We'll see how that information is obtained shortly.
The next thing that occurs during Runtime::start is that numerous high-level runtime options are parsed from the command line:
Once this is complete we kick the machine to start running:
If background is true we return to the caller; otherwise Machine::run will take care of process termination. Returning to the caller is useful if there are other things that need to be done after starting Legion, such as managing threads or doing some MPI stuff.
LLR Bootstrap / Machine Startup
During runtime startup we created a Machine instance which stores all the details of the distributed machine. Two things are going on with Machine. During initialization we do all the discovery stuff. Then the runtime calls Machine::run, which starts the ball rolling.
Discovery of machine details happens in the Machine constructor:
the_machine can be considered a singleton. Next we stash away some of the parameters:
In this case we are recording a mapping between reduction id and reduction implementation. We do the same thing for the set of tasks. Next up we parse all the low-level configuration flags from the command line:
The next big thing that happens is that GASNet is initialized. In the high-performance llr implementation GASNet serves as the fabric for inter-process messaging and memory exchange.
Then we set up all of the GASNet active message handlers for the various types of messages that will be sent between processes. There are a bunch of different message types:
More information on the low-level runtime, such as the active messages, will come in a later discussion. Next we set up the actual low-level runtime. In this way, the Machine object effectively bootstraps the low-level runtime:
Next we create a NodeAnnounceData instance, announce_data, that will get filled up with the details of our local hardware like the different memories, processors, and GPUs. After we've filled in this data, we register the information with ourselves. Then we simultaneously broadcast our information to every other node and incorporate the announcement data from other nodes into our own view of the machine.
First, handle our own specs:
And finally we wait for announcements from all other nodes in the system before proceeding. Note that during the handling of an announcement from a peer node, the details of that node are incorporated into our local view, so there is no further handling of that data during machine initialization.
Those are the major points regarding Machine setup; more details will be discussed in a later class. Next we'll go over Machine::run, which is the second phase of low-level runtime startup.
Machine::run takes a background parameter. If this is true then the first thing this method does is kick off a thread which will re-run this method immediately with background = false, and then return. Next is the construction of worker threads for the various processors in the system. At this point things are rolling, so we don't do anything except wait for the "job" to complete:
This loops around until there is nothing left to do. After that there is just some clean-up. So, the natural question here is: where did control go? How does the main task eventually start running? This is the domain of the high-level runtime bootstrapping process.
Recall that during Runtime::start the task table is passed to the constructor of the Machine instance that is created:
Runtime::get_task_table is passed true. If we look at this method:
We see that when the task table is retrieved while creating the Machine instance, Runtime::register_runtime_tasks is called as a side effect. Digging a little further we see that some specific tasks are added to the task table:
In particular, the special task with ID INIT_FUNC_ID is what is used to initialize the runtime. When a processor starts up it looks for such a task. In the following, INIT_FUNC_ID is equivalent to
So in summary, a processor gets spun up and the first thing it does is look for an initialization task to run. In this case it is Runtime::initialize_runtime. This routine eventually calls local_rt->launch_top_level_task before returning.
As part of the setup done in Runtime::initialize_runtime a Runtime instance is created, referred to as local_rt, which is then associated with each processor. It isn't clear what this is for, although it seems like it is the HLR, whereas the Runtime instance created during Machine setup is the LLR. I think this will be discussed further in a future class.
We've finally reached Runtime::launch_top_level_task(Processor proc). This method does some basic context setup, and finally adds the task (remember legion_main_id?) to the ready queue by calling add_to_ready_queue(proc, top_task, false/*prev failure*/), which then treats the top-level task like any other task. Note that this method is only called once in the system. The uniqueness of the caller is determined by the check:
Phew. And that's it: the application is running.