In previous classes we've covered a lot ground related to the high-level runtime including the operational pipeline and how tasks work. A lot of the interesting things that happen in Legion are part of the region tree, which we will start to look at today. The region tree forest is a data structure that tracks all of the region data in an Legion program.

About Legion Runtime Class

These notes are closely based on the set of Legion Runtime Class videos produced by the Legion developers. They are my own notes and code walks, and any errors or things that are just plain wrong represent my own mistakes.

Today's notes are based on the following video:

Region Tree Forest

All of the region tree bits are located in region_tree.cc and region_tree.h. The RegionTreeForest is defined in region_tree.h, and provides an interface between the rest of Legion and the region data structures.

    class RegionTreeForest {
      RegionTreeForest(Runtime *rt);

The forest structure is effectively a singleton as there is exactly one of the forests created for each instance of the high-level runtime (which is also a singleton):

    class Runtime {
      RegionTreeForest *const forest;

The forest field is set when the runtime is created:

    Runtime::Runtime(Machine m, AddressSpaceID unique,
                     const std::set<Processor> &locals,
                     const std::set<Processor> &local_utilities,
                     const std::set<AddressSpaceID> &address_spaces,
                     const std::map<Processor,AddressSpaceID> &processor_spaces,
                     Processor cleanup, Processor gc, Processor message)
      : high_level(new HighLevelRuntime(this)), machine(m), 
        address_space(unique), runtime_stride(address_spaces.size()),
        forest(new RegionTreeForest(this)), outstanding_top_level_tasks(1),

The interface of the RegionTreeForest is divided by category. For instance there are a set of interfaces for dealing with index partitions and index spaces:

    class RegionTreeForest {
      IndexPartition get_index_partition(IndexSpace parent, Color color);
      IndexSpace get_index_subspace(IndexPartition parent, Color color);
      bool has_multiple_domains(IndexSpace handle);
      Domain get_index_space_domain(IndexSpace handle);

As well as fields, logical regions, and logical and physical dependence analysis, among other things:

      void create_field_space(FieldSpace handle);
      void destroy_field_space(FieldSpace handle, AddressSpaceID source);
      bool allocate_field(FieldSpace handle, size_t field_size, 
                          FieldID fid, bool local);
      LogicalPartition get_logical_partition(LogicalRegion parent, 
                                             IndexPartition handle);
      LogicalPartition get_logical_partition_by_color(
                                  LogicalRegion parent, Color color);
      // Logical analysis methods
      void perform_dependence_analysis(RegionTreeContext ctx, 
                                       Operation *op, unsigned idx,
                                       RegionRequirement &req,
                                       RegionTreePath &path);
      void perform_fence_analysis(RegionTreeContext ctx, Operation *fence,
      // Physical analysis methods
      bool premap_physical_region(RegionTreeContext ctx,
                                  RegionTreePath &path,
                                  RegionRequirement &req,
                                  Mappable *mappable,
                                  SingleTask *parent_ctx,
                                  Processor local_proc

Tree Mutation

Let's create an index space and see what happens. We'll call the version that creates an unstructured index space. First we get a handle for the index space, and then we immediately ask the forest to create the space for us.

    IndexSpace Runtime::create_index_space(Context ctx, size_t max_num_elmts)
      IndexSpace space = IndexSpace::create_index_space(max_num_elmts);
      return space;

Next the forest creates a node to represent the root (no parent) of a new index tree. The create_node method is overloaded based on the type of nodes we are creating.

    void RegionTreeForest::create_index_space(const Domain &domain) 
      create_node(domain, NULL/*parent*/, 0/*color*/);

There are several different versions of create_node defined:

      IndexSpaceNode* create_node(Domain d, IndexPartNode *par, Color c);
      IndexPartNode*  create_node(IndexPartition p, IndexSpaceNode *par,
                                 Color c, Domain color_space, bool disjoint);
      FieldSpaceNode* create_node(FieldSpace space);
      RegionNode*     create_node(LogicalRegion r, PartitionNode *par);
      PartitionNode*  create_node(LogicalPartition p, RegionNode *par);

We'll focus here on the one that creates an IndexSpaceNode. First we create a new node, and the grab the lookup lock which protects the index_nodes structure. This structure is simply a map from index spaces to index space nodes in the forest. With tasks and data moving around in a system its possible for two requests to create the same index space occur. In this case we delete the node we created and return the one already in the tree. When there is a parent, we record that too.

    IndexSpaceNode* RegionTreeForest::create_node(Domain d, 
                                                  IndexPartNode *parent,
                                                  Color c)
      IndexSpaceNode *result = new IndexSpaceNode(d, parent, c, this);
      IndexSpace sp = d.get_index_space();
      // Check to see if someone else has already made it
        // Hold the lookup lock while modifying the lookup table
        AutoLock l_lock(lookup_lock);
        std::map<IndexSpace,IndexSpaceNode*>::const_iterator it =
        if (it != index_nodes.end())
          delete result;
          return it->second;
        index_nodes[sp] = result;
      if (parent != NULL)

      return result;

Note that unlike other operations we have seen that are deferred, these modifications are done immediately. The reason is that Legion applications frequently create an index space and then immediately start partitioning it, or doing other things. So delaying their creation would only add latency in most cases.

Index Partition

Creating an index partition is more complex. The method takes the index space that is being partitioned (the parent), a coloring that maps colors to points, and a flag indicating if the partitioning is disjoint. First we get a unique identifier for this partition.

    IndexPartition Runtime::create_index_partition(
                                          Context ctx, IndexSpace parent,
                                          const Coloring &coloring,
                                          bool disjoint,
                                          int part_color)
      IndexPartition pid = get_unique_partition_id();

Next we toss an error if the coloring is empty!

      if (coloring.empty())
        log_run(LEVEL_ERROR,"Attempt to create index partition with no "
                            "colors in task %s (ID %lld). Index partitions "
                            "must have at least one color.",
                            ctx->variants->name, ctx->get_unique_task_id());

Then we get to the good stuff. First we create a structure to represent the range of the color space. At a high-level the remainder of this function creates a new index space for each partition, and then creates a new index space partition with all the newly created index spaces as children.

The for loop that iterates over each color creates a new ElementMask for each color. An element mask is a data structure that is efficient in marking which pointers in a space have been allocated.

      Point<1> lower_bound(coloring.begin()->first);
      Point<1> upper_bound(coloring.rbegin()->first);
      Rect<1> color_range(lower_bound,upper_bound);
      Domain color_space = Domain::from_rect<1>(color_range);
      // Perform the coloring by iterating over all the colors in the
      // range.  For unspecified colors there is nothing wrong with
      // making empty index spaces.  We do this so we can save the
      // color space as a dense 1D domain.
      std::map<Color,Domain> new_index_spaces; 
      for (GenericPointInRectIterator<1> pir(color_range); pir; pir++)
        Color c = pir.p;
        std::map<Color,ColoredPoints<ptr_t> >::const_iterator finder = 
        // If we had a coloring provided, then fill in all the elements
        if (finder != coloring.end())
          const ColoredPoints<ptr_t> &pcoloring = finder->second;
          for (std::set<ptr_t>::const_iterator it = pcoloring.points.begin();
                it != pcoloring.points.end(); it++)
          for (std::set<std::pair<ptr_t,ptr_t> >::const_iterator it = 
                pcoloring.ranges.begin(); it != pcoloring.ranges.end(); it++)
            child_mask.enable(it->first.value, it->second-it->first+1);

At the end of each loop iteration we create the new index space and record it.

        // Now make the index space and save the information
        IndexSpace child_space = 
          IndexSpace::create_index_space(parent, child_mask);
        new_index_spaces[finder->first] = Domain(child_space);

And finally we create the index partition with the new index spaces as children:

      forest->create_index_partition(pid, parent, disjoint,
                                 part_color, new_index_spaces, color_space);

      return pid;

The forest then does its magic. It looks up the index space corresponding to the parent. The next bit of code is to generate a color if one wasn't provided (ie the application doesn't care what color), then the partition node is created, and finally, a new index space node is created for each color.

    void RegionTreeForest::create_index_partition(IndexPartition pid,
        IndexSpace parent, bool disjoint, 
        int color, const std::map<Color,Domain> &coloring, Domain color_space)
      IndexSpaceNode *parent_node = get_node(parent);
      Color part_color;
      if (color < 0)
        part_color = parent_node->generate_color();
        part_color = unsigned(color);
      IndexPartNode *new_part = create_node(pid, parent_node, part_color,
                                    color_space, disjoint);
      // Now do all the child nodes
      for (std::map<Color,Domain>::const_iterator it = coloring.begin();
            it != coloring.end(); it++)
        if (it->first == UINT_MAX)
          log_index(LEVEL_ERROR,"Invalid child color UINT_MAX specified "
                                "for create index partition.  All colors "
                                "must be between 0 and UINT_MAX-1");
        Domain domain = it->second;
        domain.get_index_space(true/*create if necessary*/);
        create_node(domain, new_part, it->first);


Applications may want to query information about the forest. For instance, what color corresponds to a particular index space?

    Color Runtime::get_index_space_color(Context ctx, IndexSpace handle)
      return forest->get_index_space_color(handle);

Look up the node in the forest:

    Color RegionTreeForest::get_index_space_color(IndexSpace handle)
      IndexSpaceNode *node = get_node(handle);
      return node->color;

Note that its a major problem if an application asks for an index space that doesn't exist:

    IndexSpaceNode* RegionTreeForest::get_node(IndexSpace space)
      AutoLock l_lock(lookup_lock,1,false/*exclusive*/); 
      std::map<IndexSpace,IndexSpaceNode*>::const_iterator it = 
      if (it == index_nodes.end())
        log_index(LEVEL_ERROR,"Unable to find entry for index space " IDFMT "."
                              "This is either a runtime bug, or requires "
                              "Legion fences if index space names are being "
                              "returned out of the context in which they are "
      return it->second;

Then we have a handle from get_node to one of the structures in the forest. For instance in this case the index space node structure contains lots of methods specific to this node type:

    class IndexSpaceNode : public IndexTreeNode {
      IndexSpaceNode(Domain d, IndexPartNode *par, Color c,
                     RegionTreeForest *ctx);
      IndexSpaceNode(const IndexSpaceNode &rhs);

      bool has_child(Color c);
      IndexPartNode* get_child(Color c);
      void add_child(IndexPartNode *child);

      bool are_disjoint(Color c1, Color c2);
      void add_disjoint(Color c1, Color c2);
      Color generate_color(void);
      void get_colors(std::set<Color> &colors);

Likewise for the index partition node that we had shown previously:

    class IndexPartNode : public IndexTreeNode { 
      IndexPartNode(IndexPartition p, IndexSpaceNode *par,
                    Color c, Domain color_space, bool dis,
                    RegionTreeForest *ctx);
      IndexPartNode(const IndexPartNode &rhs);
      virtual ~IndexPartNode(void);

But both types inherit from IndexTreeNode, and that is where the color field is located:

    class IndexTreeNode {

      const unsigned depth;
      const Color color;
      RegionTreeForest *const context;


Region Trees vs Index Space Trees

Recall that index space nodes in the forest are created eagerly. That policy is also true for field space, but the property doesn't hold for logical regions, because they are more complex. Instead nodes in a logical region tree are created lazily as an optimization that not all nodes in a system need to see all of the region tree, which can be quite large.

We can see some of the differences by looking at RegionTreeForest::get_node for the LogicalRegion node type. Note that we also grab the lookup lock, but if the node isn't found we don't throw an error, and instead we create a new node. And notice that we also recursively call get_node to acquire parent nodes. In this way only the path in the tree that is needed for a particular request is instantiated.

    RegionNode* RegionTreeForest::get_node(LogicalRegion handle)
      // Check to see if the node already exists
        AutoLock l_lock(lookup_lock,1,false/*exclusive*/);
        std::map<LogicalRegion,RegionNode*>::const_iterator it = 
        if (it != region_nodes.end())
          return it->second;
      // Otherwise it hasn't been made yet, so make it
      IndexSpaceNode *index_node = get_node(handle.index_space);

      LogicalPartition parent_handle(handle.tree_id, index_node->parent->handle,
      // Note this request can recursively build more nodes, but we
      // are guaranteed that the top level node exists
      PartitionNode *parent = get_node(parent_handle);
      // Now make our node and then return it
      return create_node(handle, parent);