Step-by-step Guide to Adding a New Dialect in MLIR

For one of my projects, I needed to add a new dialect to the main MLIR tree. However, following the information available, I encountered some issues. I made a "clean" example dialect, which I was able to add correctly. This post discusses how this is achieved, and links to some code.
The information in this post was sourced partially from Chapter 2 of the Toy
tutorial and the Creating a Dialect tutorial. I hope to update the latter with
some of the steps described below. Note that MLIR/LLVM often has API-breaking
changes, so this guide may no longer be entirely correct or best practice by
the time you read it. My code builds on 42204c9.
If you just want to see the code/diff for a complete working example,
check out commit 7c89cfe on the new_dialect branch on GitHub.
Overall, to add the new dialect I changed three files and created six new ones.
If you're unfamiliar with MLIR, I recommend you check out the docs, and the
full Toy tutorial is worth doing too.
include
The first thing we need to do is decide how we want to define our dialect. MLIR allows us to define the dialect using TableGen, which automatically generates a lot of the boilerplate required, as well as reducing the costs of maintenance if an API breaking change occurs. We could also write the C++ ourselves, but for many dialects this is overkill.
Step 1: Let's create a directory mlir/include/mlir/Dialect/Foo/, where we
will store our dialect definitions. Make sure to add_subdirectory(Foo) in the
CMakeLists.txt of mlir/include/mlir/Dialect.
Step 2: Next, we're going to write the basic definition of our dialect in
mlir/include/mlir/Dialect/Foo/FooBase.td. Here we'll give our dialect a
name, the C++ namespace that it will use, and a description:
#ifndef FOO_BASE
#define FOO_BASE
include "mlir/IR/OpBase.td"
def Foo_Dialect : Dialect {
  let name = "foo";
  let cppNamespace = "::mlir::foo";
  let description = [{
    Lorem Ipsum
  }];
}
#endif // FOO_BASE
Step 3: Let's also create a mostly blank file FooOps.td. This is where we
would include the definition of the operations of our dialect, if we had any.
For now, let's just put some simple includes:
#ifndef FOO_OPS
#define FOO_OPS
include "mlir/Dialect/Foo/FooBase.td"
include "mlir/Interfaces/InferTypeOpInterface.td"
include "mlir/Interfaces/VectorInterfaces.td"
include "mlir/Interfaces/SideEffectInterfaces.td"
#endif // FOO_OPS
From these two files, TableGen will generate some C++ files that can be
included elsewhere in the project. For example, in our build directory (once
we've set up the rest of our code), we will find the file
./tools/mlir/include/mlir/Dialect/Foo/FooOpsDialect.h.inc. This will look
something like this, declaring the C++ class of our dialect.
/*===- TableGen'erated file -------------------------------------*- C++ -*-===*\
|* *|
|* Dialect Declarations *|
|* *|
|* Automatically generated file, do not edit! *|
|* From: FooOps.td *|
|* *|
\*===----------------------------------------------------------------------===*/
namespace mlir {
namespace foo {
class FooDialect : public ::mlir::Dialect {
  explicit FooDialect(::mlir::MLIRContext *context);
  void initialize();
  friend class ::mlir::MLIRContext;
public:
  ~FooDialect() override;
  static constexpr ::llvm::StringLiteral getDialectNamespace() {
    return ::llvm::StringLiteral("foo");
  }
};
} // namespace foo
} // namespace mlir
MLIR_DECLARE_EXPLICIT_TYPE_ID(::mlir::foo::FooDialect)
Note that the above is automatically generated, and you should only edit the TableGen files to create it. You can extend the dialect with C++ later if you want, or for some advanced cases you may need to define your dialect in C++ from the start.
Step 4: I also defined a file Foo.h, which we can use to include our
dialect elsewhere, avoiding the ugliness of .inc files. This looks like:
#ifndef MLIR_DIALECT_FOO_H_
#define MLIR_DIALECT_FOO_H_
#include "mlir/Bytecode/BytecodeOpInterface.h"
#include "mlir/IR/BuiltinTypes.h"
#include "mlir/IR/Dialect.h"
#include "mlir/IR/OpDefinition.h"
#include "mlir/IR/OpImplementation.h"
#include "mlir/Interfaces/InferTypeOpInterface.h"
#include "mlir/Interfaces/SideEffectInterfaces.h"
#include "mlir/Interfaces/VectorInterfaces.h"
//===----------------------------------------------------------------------===//
// Foo Dialect
//===----------------------------------------------------------------------===//
#include "mlir/Dialect/Foo/FooOpsDialect.h.inc"
//===----------------------------------------------------------------------===//
// Foo Dialect Operations
//===----------------------------------------------------------------------===//
#define GET_OP_CLASSES
#include "mlir/Dialect/Foo/FooOps.h.inc"
#endif // MLIR_DIALECT_FOO_H_
Step 5: Next, let's create the CMakeLists.txt file in the Foo include
directory:
add_mlir_dialect(FooOps foo)
add_mlir_doc(FooOps FooOps Dialects/ -gen-dialect-doc -dialect foo)
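This ensures TableGen is run on FooOps.td to generate the .inc files, and it creates the MLIRFooOpsIncGen target that the dialect library will depend on.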
Step 6: Finally, an optional step is to register our dialect globally;
otherwise we will need to add it manually to the registry of whatever tool we
need it for. If you open the file mlir/include/mlir/InitAllDialects.h, you will
see where this is done. Add the line #include "mlir/Dialect/Foo/Foo.h" to the
includes and add foo::FooDialect to the template arguments of the
registry.insert call, and once we're finished the dialect should be globally
available. If you don't want it registered globally, you can instead put a
registry.insert line for your dialect in the executable you care about.
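To make the difference between global and per-tool registration a bit more concrete, here is a minimal standalone sketch of the idea behind registry.insert. It does not use MLIR at all: the sketch::DialectRegistry class, the BazDialect struct and the toolRegistry function are hypothetical stand-ins, and the real mlir::DialectRegistry does considerably more. The only detail borrowed from the generated code above is the static getDialectNamespace() method.

```cpp
#include <cassert>
#include <set>
#include <string>

namespace sketch {

// Hypothetical stand-ins for two dialect classes. Each exposes its name
// through a static method, as the generated FooDialect does.
struct FooDialect {
  static std::string getDialectNamespace() { return "foo"; }
};
struct BazDialect {
  static std::string getDialectNamespace() { return "baz"; }
};

// Toy registry: it only remembers the namespaces it was told about.
class DialectRegistry {
public:
  // Variadic insert, so several dialects can be added in one call.
  template <typename... Dialects>
  void insert() {
    // C++11-compatible pack expansion: record each dialect's namespace.
    int expand[] = {0, (names.insert(Dialects::getDialectNamespace()), 0)...};
    (void)expand;
  }

  bool isRegistered(const std::string &name) const {
    return names.count(name) != 0;
  }

private:
  std::set<std::string> names;
};

// A tool that registers only the dialect it cares about.
inline DialectRegistry toolRegistry() {
  DialectRegistry registry;
  registry.insert<FooDialect>();
  return registry;
}

} // namespace sketch
```

In this toy model, a tool built with toolRegistry() knows about foo but not baz, which mirrors why a dialect that is not in InitAllDialects.h has to be inserted by hand in each executable that needs it.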
Source code
There isn't much regarding "implementation" for our dialect, since we don't actually have any operations or transformations yet. However, to get our minimum working dialect, we do require a little bit of code.
Step 7: First, let's create the directory mlir/lib/Dialect/Foo/. Be
sure to add add_subdirectory(Foo) to the CMakeLists.txt of the parent
directory. Next, let's create a file FooDialect.cpp in it. This will use some
of the auto-generated implementation boilerplate from the previous steps; see
the #include statements.
#include "mlir/Dialect/Foo/Foo.h"
using namespace mlir;
using namespace mlir::foo;
#include "mlir/Dialect/Foo/FooOpsDialect.cpp.inc"
void mlir::foo::FooDialect::initialize() {
  addOperations<
#define GET_OP_LIST
#include "mlir/Dialect/Foo/FooOps.cpp.inc"
      >();
}
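The #include in the middle of the addOperations template arguments can look odd at first. As far as I understand it, when GET_OP_LIST is defined, FooOps.cpp.inc expands to nothing but a comma-separated list of op class names, so the preprocessor simply splices that list into the template argument list. The standalone sketch below mimics this without MLIR, using a macro in place of the include; ToyDialect, BarOp, BazOp and FOO_OP_LIST are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical op classes standing in for what TableGen would generate.
struct BarOp {
  static std::string getOperationName() { return "foo.bar"; }
};
struct BazOp {
  static std::string getOperationName() { return "foo.baz"; }
};

// Stand-in for the GET_OP_LIST section of FooOps.cpp.inc: just a
// comma-separated list of class names.
#define FOO_OP_LIST BarOp, BazOp

// Toy dialect that records the names of the ops registered with it.
struct ToyDialect {
  std::vector<std::string> opNames;

  template <typename... Ops>
  void addOperations() {
    // C++11-compatible pack expansion: push each op's name, in order.
    int expand[] = {0, (opNames.push_back(Ops::getOperationName()), 0)...};
    (void)expand;
  }

  void initialize() {
    // After preprocessing, this reads addOperations<BarOp, BazOp>().
    addOperations<
        FOO_OP_LIST
        >();
  }
};
```

With an empty list the call becomes addOperations<>(), which is why the real initialize() compiles even though our dialect has no operations yet.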
Step 8: Finally, let's create our CMakeLists.txt. This will create the
dialect library and allow executables to link against it. It should also
make the library available under the CMake variable dialect_libs, which is
used in the compilation of tools such as mlir-opt, so you shouldn't need to do
any manual linking to get that working.
add_mlir_dialect_library(MLIRFooDialect
  FooDialect.cpp

  ADDITIONAL_HEADER_DIRS
  ${MLIR_MAIN_INCLUDE_DIR}/mlir/Dialect/Foo

  DEPENDS
  MLIRFooOpsIncGen

  LINK_LIBS PUBLIC
  MLIRDialect
  MLIRIR
  MLIRUBDialect
  )
Verification
Great, we now have everything we need to compile: we have created our new
dialect Foo and registered it in the main MLIR dialect registry. Go ahead and
build. Now, to verify that our dialect was added correctly, we can run
mlir-opt. Pass the --show-dialects flag and it will print a list of the
available dialects. You should see foo amongst them.
And that's us done. You can extend this example to make a more fully featured dialect. Now that you have a working dialect, it might be a good time to revisit the dialect definition tutorial.
Bonus! Adding an operation
Clearly, the next step in creating a dialect is to start adding operations to it! You can see in the Toy tutorial how we can do that, but what does this look like in our stripped-down Foo dialect?
We defined an initial FooOps.td ODS file above, but we didn't actually include
any operations.
Let's update this file:
include "mlir/Dialect/Foo/FooBase.td"
include "mlir/Interfaces/FunctionInterfaces.td"
include "mlir/IR/SymbolInterfaces.td"
include "mlir/Interfaces/SideEffectInterfaces.td"
class Foo_Op<string mnemonic, list<Trait> traits = []> :
    Op<Foo_Dialect, mnemonic, traits>;
def BarOp : Foo_Op<"bar"> {
  let summary = "bar operation";
}
We define a high-level Foo_Op, which all of the operations in our Foo dialect
are derived from. Then, we have our operation, which we will call bar. Right
now it takes no arguments and returns nothing, and we have our definition under
BarOp. Much like before, TableGen will create the necessary header files and
implementations for our operation.
The other thing we need to add is the appropriate inclusion of our op classes.
Add the following to the end of our FooDialect.cpp file:
#define GET_OP_CLASSES
#include "mlir/Dialect/Foo/FooOps.cpp.inc"
If you don't include the FooOps.cpp.inc file with GET_OP_CLASSES, then you
may encounter link errors such as:
ld.lld: error: undefined symbol: mlir::detail::TypeIDResolver<mlir::foo::BarOp, void>::id
ld.lld: error: undefined symbol: mlir::foo::BarOp::verifyInvariantsImpl()
Essentially, the linker can't find the definitions that TableGen creates for
our op, such as TypeIDResolver and verifyInvariantsImpl.
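The reason, as far as I can tell, is that the op definitions in FooOps.cpp.inc sit behind a preprocessor guard, so they are only compiled when GET_OP_CLASSES is defined before the include. The standalone sketch below mimics that layout in a single file without MLIR; ToyBarOp is a hypothetical stand-in, and the real generated code is much larger.

```cpp
#include <cassert>
#include <string>

// Declaration only. In the real dialect this comes from FooOps.h.inc,
// which Foo.h includes.
struct ToyBarOp {
  static std::string getOperationName();
};

// The define that FooDialect.cpp must provide before the include.
#define GET_OP_CLASSES

// Stand-in for the guarded section of FooOps.cpp.inc. Without the define
// above, the definition below is skipped, and any use of
// ToyBarOp::getOperationName() fails at link time with an undefined symbol.
#ifdef GET_OP_CLASSES
#undef GET_OP_CLASSES
std::string ToyBarOp::getOperationName() { return "foo.bar"; }
#endif // GET_OP_CLASSES
```

Removing the #define line in this sketch and calling ToyBarOp::getOperationName() reproduces the same kind of failure: the declaration is visible, so the code compiles, but the definition never made it into any object file.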
You can see the changes in commit 0cd014c.