Build pipeline

This document describes the Build pipeline in Stride, its current implementation (and legacy), and the work that should be done to improve it.

Terminology

An Asset is a design-time object containing information to generate Content that can be loaded at runtime. For example, a Model asset contains the path to a source FBX file, and additional information such as an offset for the pivot point of the model, a scale factor, and a list of materials to use for this model. A Sprite font asset contains a path to a source font, multiple parameters such as the size, kerning, etc., and information describing in which form it should be compiled (such as pre-rasterized, or using a distance field). Assets are serialized on disk using the YAML format, and are part of the data that a team developing a game should share through a source control system.
Content is the name given to compiled data (usually generated from Assets) that can be loaded at runtime. In terms of format, Content is optimized for performance and size (using binary serialization, and data structured so that the runtime can consume it without re-transforming it). Content is therefore the platform-specific, optimized version of your game data.

Design

Stride uses Content-addressable storage to store the data generated by the compilation. The main concept is that the actual name of each generated file is the hash of the file. So if, after a change, the content built from the asset is different, then the file name will be different. An index map file contains the mapping between the content URL and the actual hash of the corresponding file. The parameters of each compilation command are also hashed and stored in this database, so if a command is run again with the same parameters, the build engine can easily recover the hashes of the corresponding generated files.
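As an illustration of the storage scheme described above, here is a minimal sketch in Python. It is not Stride's implementation: the ContentStore class, its method names, and the choice of SHA-256 as the hash function are all assumptions made for the example.

```python
import hashlib


class ContentStore:
    """Toy content-addressable store (hypothetical, not Stride's database)."""

    def __init__(self):
        self.blobs = {}          # hash -> bytes (stands in for files named by their hash)
        self.index_map = {}      # content URL -> hash of the current blob
        self.command_cache = {}  # hash of command parameters -> {url: hash}

    @staticmethod
    def hash_bytes(data: bytes) -> str:
        # SHA-256 is an assumption; any stable content hash would do.
        return hashlib.sha256(data).hexdigest()

    def write(self, url: str, data: bytes) -> str:
        object_id = self.hash_bytes(data)
        self.blobs[object_id] = data
        self.index_map[url] = object_id
        return object_id

    def read(self, url: str) -> bytes:
        return self.blobs[self.index_map[url]]

    def run_command(self, parameters: str, compile_fn):
        """Run compile_fn unless a command with the same parameters already ran."""
        key = self.hash_bytes(parameters.encode("utf-8"))
        if key in self.command_cache:
            # Same parameters: recover the output hashes without recompiling.
            outputs = self.command_cache[key]
            self.index_map.update(outputs)
            return outputs, False
        outputs = {url: self.write(url, data) for url, data in compile_fn().items()}
        self.command_cache[key] = outputs
        return outputs, True


store = ContentStore()
calls = []


def compile_texture():
    calls.append(1)
    return {"Textures/Stone": b"compiled-stone-v1"}


first, ran_first = store.run_command("texture:Stone:size=256", compile_texture)
second, ran_second = store.run_command("texture:Stone:size=256", compile_texture)
```

In this sketch the second call finds the parameter hash in the cache and returns the recorded output hashes without invoking the compile function again.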

Build Engine

The build engine is the part of the infrastructure that transforms data from the assets into actual content and saves it to the database. It was originally designed to build content from input similar to a makefile (eg. "compile all files in MyModels/*.fbx into Stride models"). It was later changed to work with individual assets when the asset layer was implemented. Due to this legacy, this library is still not well suited to building assets efficiently (dependencies of build steps, management of a queue for live-compiling in the Game Studio, etc.).

Builder

The Builder class is the entry point of the build engine. A Builder will spawn a given number of threads, each one running a Microthread scheduler (see RunUntilEnd method).

Build Steps

The Builder takes a root BuildStep as input. We currently have two types of BuildSteps:

A ListBuildStep contains a sequence of BuildStep (Formerly we had an additional parent class called EnumerableBuildStep, but it has been merged into ListBuildStep). A ListBuildStep will schedule all the build steps it contains at the same time, to be run in parallel. Formerly we had a synchronization mechanism using a special WaitBuildStep but it has been removed. We now use PrerequisiteSteps with LinkBuildSteps to manage dependencies.
A CommandBuildStep contains a single Command to run, which does the actual work of compiling an asset.

TODO: Currently, when compiling a graph of build steps, we need to have all steps to compile in the root ListBuildStep. More specifically, if we have a ListBuildStep container in which we want to put a step A that depends on steps B and C, we need to put A, B and C in the ListBuildStep container. This is cumbersome and error-prone. What we would like to do is rely only on the PrerequisiteSteps of a given step to find what we have to compile. If we did so, we wouldn't need to return a ListBuildStep in AssetCompilerResult, but just the final build step for the asset, the graph of dependent build steps being described by recursive PrerequisiteSteps. The ListBuildStep container could be removed. We would still need lists of build steps when we compile multiple assets (eg. when compiling the full game), but the build engine would not need to be aware of them.
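The idea in this TODO, discovering everything to compile from the final step alone, can be sketched as a depth-first walk over prerequisites. The BuildStep class and collect_steps function below are hypothetical stand-ins written in Python, not the engine's types, and the sketch assumes the prerequisite graph has no cycles.

```python
class BuildStep:
    """Hypothetical build step; only the prerequisite links matter here."""

    def __init__(self, name, prerequisites=()):
        self.name = name
        self.prerequisites = list(prerequisites)


def collect_steps(final_step):
    """Return every step reachable from final_step, prerequisites first."""
    ordered, visited = [], set()

    def visit(step):
        if id(step) in visited:
            return  # a shared prerequisite is only collected once
        visited.add(id(step))
        for prerequisite in step.prerequisites:
            visit(prerequisite)
        ordered.append(step)

    visit(final_step)
    return ordered


# Step A depends on B and C; only A is handed to the collector.
b = BuildStep("B")
c = BuildStep("C")
a = BuildStep("A", prerequisites=[b, c])
order = [step.name for step in collect_steps(a)]
```

With such a walk, the caller only has to return the final step A; B and C are found through its prerequisites rather than being listed by hand in a container.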

Commands

Most commands inherit from IndexFileCommand, which automatically registers the output of the command into the command context.

At the beginning of the command (in the PreCommand method), a BuildTransaction object is created. This transaction contains a subset of the database of objects that have already been compiled, provided by ICommandContext.GetOutputObjectsGroups(). In terms of implementation, this method returns all the objects that were written by prerequisite build steps, and all the objects that are already written in any of the parent ListBuildSteps, recursively. The objects coming from the parent ListBuildStep are a legacy of when we were using WaitBuildStep to synchronize the build steps. This should eventually be implemented differently, relying only on prerequisites (since no synchronization can happen in the ListBuildStep itself, everything is run in parallel).

TODO: Rewrite how OutputObjects are transferred from BuildSteps to other BuildSteps. Only the output from prerequisite BuildSteps should be transferred. A lot of legacy makes this code very convoluted and hard to maintain.

The BuildTransaction created during this step is mounted as a Microthread-local database, which is accessible only from the current microthread (which is basically the current command).

At the end of the command (in the PostCommand method), every object that has been written to the database by the command is extracted from the BuildTransaction and registered to the current ICommandContext (which is how the ICommandContext can "flow" objects from one command to the other).

It's important to keep in mind that the objects accessible in a given command (in DoCommandOverride) using a ContentManager are those provided during the PreCommand step. It is therefore essential that dependencies between commands (which other commands must complete before a command can start) are properly set.
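The three phases above can be sketched as follows. This is a simplified Python model, not the engine's code: CommandContext, Command, and their methods are hypothetical, and only outputs from prerequisites are made visible (the direction the TODO above aims for), ignoring the legacy parent ListBuildStep lookup.

```python
class CommandContext:
    """Hypothetical context holding the objects each finished step has output."""

    def __init__(self):
        self.outputs_by_step = {}  # step name -> {url: object id}

    def get_output_objects(self, prerequisite_names):
        visible = {}
        for name in prerequisite_names:
            visible.update(self.outputs_by_step.get(name, {}))
        return visible

    def register_outputs(self, step_name, outputs):
        self.outputs_by_step[step_name] = dict(outputs)


class Command:
    def __init__(self, name, prerequisites, work):
        self.name = name
        self.prerequisites = prerequisites
        self.work = work  # callable(readable objects) -> {url: object id}

    def execute(self, context):
        # Pre-command phase: snapshot what the prerequisite steps have written.
        readable = context.get_output_objects(self.prerequisites)
        # Command body: it can only see the objects in that snapshot.
        written = self.work(readable)
        # Post-command phase: publish what this command wrote.
        context.register_outputs(self.name, written)
        return written


context = CommandContext()
compile_model = Command("model", [], lambda objs: {"Models/Crate": "hash-model"})
merge_prefab = Command(
    "prefab", ["model"],
    lambda objs: {"Prefabs/Crates": "merged:" + objs["Models/Crate"]},
)
# This command forgot to declare its prerequisite, so it sees nothing.
unlinked = Command("unlinked", [], lambda objs: {"seen": sorted(objs)})

model_out = compile_model.execute(context)
prefab_out = merge_prefab.execute(context)
unlinked_out = unlinked.execute(context)
```

The last command illustrates the warning in the text: without a declared prerequisite, the model's output is not part of its snapshot even though the model was already compiled.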

Compilers

Compilers are classes that generate a set of BuildSteps to compile a given Asset in a specific context. This list could grow in the future if we have other needs, but the current different contexts are:

compiling the asset for the game
compiling the asset for the scene editor
compiling the asset to display in the preview
compiling the asset to generate a thumbnail

IAssetCompiler

This is the base interface for compilers. The entry point is the Prepare method, which takes an AssetItem and returns an AssetCompilerResult, which is a mix of a LoggerResult and a ListBuildStep. Usually there are two implementations per asset type, one to compile the asset for the game and one to compile the asset for its thumbnail. Some asset types such as animations might have an additional implementation for the preview.

Each implementation of IAssetCompiler must have the AssetCompilerAttribute attached to the class in order to be registered (compilers are registered via the AssetCompilerRegistry).

TODO: The AssetCompilerRegistry could be merged into the AssetRegistry to have a single location where asset-related types and meta-information are registered.

Each compiler provides a set of methods to help discover the dependencies between assets and compilers. They will be covered later in this document.

ICompilationContext

Not to be mistaken with CompilerContext and AssetCompilerContext.

Contexts of compilation are defined by types, which allows inheritance to be used to fall back on a default compiler when there is no specific compiler for a given context. Each compilation context type must implement ICompilationContext. Currently we have:

AssetCompilationContext is the context used when we compile an asset for the runtime (ie. the game).
EditorGameCompilationContext is the context used when we compile an asset for the scene editor, which is a specific runtime. Therefore, it inherits from AssetCompilationContext.
PreviewCompilationContext is the context used when we compile an asset for the preview, which is a specific runtime. Therefore, it inherits from AssetCompilationContext.
ThumbnailCompilationContext is the context used when we compile an asset to generate a thumbnail. Generally, for thumbnails, we compile one or several assets for the runtime, and use additional steps to generate the thumbnail with the ThumbnailCompilationContext (see below).
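The fallback-by-inheritance lookup can be sketched in Python by walking the base classes of the requested context type. The four context class names come from the list above; the registry dictionary, the register and find_compiler functions, and the compiler labels are hypothetical and only illustrate the lookup order.

```python
class AssetCompilationContext:
    pass


class EditorGameCompilationContext(AssetCompilationContext):
    pass


class PreviewCompilationContext(AssetCompilationContext):
    pass


class ThumbnailCompilationContext:
    # Not described as inheriting from AssetCompilationContext, so kept separate.
    pass


compilers = {}  # (asset type name, context type) -> compiler label


def register(asset_type, context_type, compiler):
    compilers[(asset_type, context_type)] = compiler


def find_compiler(asset_type, context_type):
    """Try the requested context first, then each of its base types in order."""
    for candidate in context_type.__mro__:
        compiler = compilers.get((asset_type, candidate))
        if compiler is not None:
            return compiler
    return None


register("Animation", AssetCompilationContext, "game animation compiler")
register("Animation", PreviewCompilationContext, "preview animation compiler")

for_preview = find_compiler("Animation", PreviewCompilationContext)
for_editor = find_compiler("Animation", EditorGameCompilationContext)
for_thumbnail = find_compiler("Animation", ThumbnailCompilationContext)
```

Here the preview context gets its specific compiler, the editor context falls back to the game compiler through its base type, and the thumbnail context finds nothing because no compiler was registered along its own hierarchy.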

TODO: Currently thumbnail compilation is in a poor state. In ThumbnailListCompiler.Compile, we first generate the steps to compile the asset in PreviewCompilationContext, then generate the steps to compile the asset in ThumbnailCompilationContext, and finally we link the first with the latter. Dependencies from thumbnail compilers (which load a scene and take screenshots) to the runtime compiler (which compiles the asset) are not expressed at all. It only works for now because in all current cases, the PreviewCompilationContext does what we need for thumbnails (for example, the AnimationAssetPreviewCompiler adds the preview model to the normal compilation of the animation, which is needed for both preview and thumbnail).

Dependency managers

We currently have two mechanisms that handle dependencies.

TODO: Merge the AssetDependencyManager and the BuildDependencyManager together into a single dependency manager object. There is a lot of redundancy between the two: one relies on the other, and some code is duplicated. See XK-4862.

AssetDependencyManager

The AssetDependencyManager was the first implementation of a mechanism to manage dependencies between assets. It works independently of the build, which is one of its main issues and the reason why we started to develop a new infrastructure.

It is based essentially on visiting assets with a DataVisitorBase to find references to other assets. There are two ways of referencing an asset:

Having a property whose type is an implementation of IReference. More explicitly, the only case we currently have is AssetReference. This type contains an AssetId and a Location corresponding to the referenced asset.
Having a property whose type corresponds to a Content type, ie. a type registered as being the compiled version of an asset type (for example, Texture is the Content type of TextureAsset).

The problem with that design is that once all the references are collected, there is no way to know how the referenced assets are actually consumed, which could be in one of the three following ways:

the referenced asset is not needed to compile this asset, but it's needed at runtime to use the compiled content (eg. Models need Materials, which need Textures. But you can compile Models, Materials and Textures independently).
the referenced asset needs to be compiled before this asset, and the compiler of this asset needs to load the corresponding content generated from the referenced asset (eg. a prefab model, which aggregates multiple models together, needs the compiled version of each model it references to be able to merge them).
the referenced asset is read when compiling this asset because it depends on some of its parameters, but the referenced asset itself doesn't need to be compiled first (eg. Navigation Meshes need to read the scene asset they are related to in order to gather the static colliders it contains, but they don't need to compile the scene itself).
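To make the three cases concrete, the sketch below tags each reference with the way it is consumed and derives a build order constraint from it. This is an illustrative Python model: the DependencyKind enum, its member names, and the helper functions are hypothetical, and the sample references mirror the examples given in the list above.

```python
from enum import Enum


class DependencyKind(Enum):
    RUNTIME = "needed at runtime only"
    COMPILE_CONTENT = "compiled content needed to compile"
    COMPILE_ASSET = "asset read while compiling"


# (referencing asset, referenced asset) -> how the reference is consumed
references = {
    ("Model", "Material"): DependencyKind.RUNTIME,
    ("Material", "Texture"): DependencyKind.RUNTIME,
    ("PrefabModel", "Model"): DependencyKind.COMPILE_CONTENT,
    ("NavigationMesh", "Scene"): DependencyKind.COMPILE_ASSET,
}


def targets_of(asset, kind):
    return sorted(
        target
        for (source, target), k in references.items()
        if source == asset and k is kind
    )


def must_build_before(asset):
    """Assets whose compiled content must exist before `asset` is compiled."""
    return targets_of(asset, DependencyKind.COMPILE_CONTENT)


def must_ship_with(asset):
    """Assets only required when the compiled content is loaded at runtime."""
    return targets_of(asset, DependencyKind.RUNTIME)


prefab_order = must_build_before("PrefabModel")
model_order = must_build_before("Model")
navmesh_order = must_build_before("NavigationMesh")
```

Only the prefab model imposes an ordering constraint here; a plain list of references, without the kind attached, could not tell these three situations apart.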

BuildDependencyManager

The BuildDependencyManager has been introduced recently to solve the problems of the AssetDependencyManager. It is currently not complete, and the ultimate goal is to merge it totally with the AssetDependencyManager.

The approach is a bit different. Rather than extracting dependencies from the asset itself, we extract them from the compilers of the assets, which are better suited to know what they exactly need to compile the asset and what will be needed to load the asset at runtime.

However, one asset type can have multiple compilers associated with it (for the game, for the thumbnail, for the preview...), so the BuildDependencyManager works in the context of a specific compiler.

Currently there is one BuildDependencyManager for each type of compiler.

TODO: Have a single global instance of BuildDependencyManager that contains all types of dependencies for all compiler contexts. For example, we have thumbnail compilers that require the game version of assets, which means that the BuildDependencyManager for thumbnails will also contain a large part of the BuildDependencyManager used to build the game. Merging everything into a single graph would reduce redundancy and the risk of triggering the same operation multiple times simultaneously.

AssetDependenciesCompiler

The AssetDependenciesCompiler is the object that computes the dependencies with the BuildDependencyManager, and then generates the build steps for a given asset, including the runtime dependencies. It's the main entry point of compilation for the CompilerApp, the scene editor, and the preview. Thumbnails also use it, via the ThumbnailListCompiler class.

TODO: This class should be removed, and its content moved into the BuildDependencyManager class. By doing so, it should be possible to make BuildAssetNode and BuildAssetLink internal - those classes are just the data of the dependency graph, they should not be exposed publicly. To do that, a method to retrieve the dependencies in a given context must be implemented in BuildDependencyManager in order to fix the usage of BuildAssetNode in EditorContentLoader.

In the Game Studio

The Game Studio continuously compiles assets in various versions. It has specific ways of managing the database and content depending on the context.

Remark: the Game Studio never saves the index file to disk; it always keeps the url -> hash mapping in memory.

Databases

Before accessing content to load, a Microthread-local database must be mounted. Depending on the context, it can be a database containing a scene and its dependencies (scene editor), the assets needed to create a thumbnail, an asset to display in the preview...

For the scene editor, this is handled by the GameStudioDatabase class. Thumbnails and preview also handle database mounting internally (in ThumbnailGenerator for example).

TODO: See if it could be possible/useful to wrap all database-mounting in the Game Studio into the GameStudioDatabase class.

Builder service

All compilations that occur in the Game Studio are done through the GameStudioBuilderService. This class creates an instance of Builder and a DynamicBuilder, which allows the Builder to be fed with build steps at any time. Having a single builder for the whole Game Studio makes it easier to control the number of threads and concurrent tasks.

The DynamicBuilder class simply creates a thread to run the Builder on, and sets a special build step, DynamicBuildStep, as the root step of this builder. This step permanently waits for other child build steps to be posted, and executes them.
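The "permanently waiting root step" pattern can be sketched with a queue drained by a dedicated thread. This Python class is a toy stand-in that only borrows the DynamicBuilder name: its post and stop methods are hypothetical, and it runs posted steps one at a time, whereas the real builder schedules work across several threads.

```python
import queue
import threading


class DynamicBuilder:
    """Toy stand-in: one worker thread executing build steps posted at any time."""

    _STOP = object()

    def __init__(self):
        self._pending = queue.Queue()
        self.completed = []
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def post(self, name, work):
        """Queue a build step; the returned event is set once it has run."""
        done = threading.Event()
        self._pending.put((name, work, done))
        return done

    def _run(self):
        # Plays the role of the root step: block until a child step is posted.
        while True:
            item = self._pending.get()
            if item is self._STOP:
                break
            name, work, done = item
            work()
            self.completed.append(name)
            done.set()

    def stop(self):
        self._pending.put(self._STOP)
        self._thread.join()


builder = DynamicBuilder()
results = []

first = builder.post("texture", lambda: results.append("texture built"))
first.wait(timeout=5)
# A second step can be posted long after the builder started.
second = builder.post("material", lambda: results.append("material built"))
second.wait(timeout=5)
builder.stop()
```

The caller never restarts the builder between posts; it stays alive and idle until new work arrives, which is what lets the rest of the editor feed it steps whenever an asset changes.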

TODO: Currently the dynamic build step waits arbitrarily with the CompleteOneBuildStep method when more than 8 assets are compiling. This is a poor design because if the 8 assets are, for example, prefabs that contain a lot of models, materials and textures, it will block until all are done, although we could complete the thumbnails of these models/materials/textures individually. Ideally, this await should be removed, and a way to make sure thumbnails of compiled assets are created as soon as possible should be implemented.

The builder service uses AssetBuildUnits as unit of compilation. A build unit corresponds to a single asset, and encapsulates the compiler and the generated build step of this asset.

EditorContentLoader

The scene editor needs a special behavior in terms of asset loading. The main issue is that any type of asset can be modified by the user (for example a texture), and then needs to be reloaded. Stride uses the ContentManager to handle reference counting of loaded assets. With a few exceptions (Materials, maybe Textures), it does not support hot-swapping an asset. Therefore, when an asset needs to be reloaded, we actually need to unload and reload the first-referencer of this asset.

The first-referencer is the first asset referenced by an entity that has a path (in terms of references) to the asset to reload. For example, in the case of a texture, we will have to reload all models that use materials that use the texture to reload.
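Finding first-referencers amounts to a reachability test from each asset that an entity references directly. The Python sketch below assumes two hypothetical dictionaries (entity to assets, asset to referenced assets) and made-up asset names; it does not reflect how the editor stores references.

```python
def first_referencers(entity_references, asset_references, changed_asset):
    """Return assets directly referenced by an entity that can reach changed_asset."""

    def reaches(asset, seen):
        if asset == changed_asset:
            return True
        if asset in seen:
            return False  # already explored, avoids looping on cyclic references
        seen.add(asset)
        return any(reaches(child, seen) for child in asset_references.get(asset, ()))

    roots = {asset for assets in entity_references.values() for asset in assets}
    return sorted(root for root in roots if reaches(root, set()))


# Hypothetical scene: entities point at models, models at materials, materials at textures.
entity_references = {
    "CrateEntity": ["CrateModel"],
    "WallEntity": ["WallModel"],
    "LampEntity": ["LampModel"],
}
asset_references = {
    "CrateModel": ["StoneMaterial"],
    "WallModel": ["StoneMaterial"],
    "LampModel": ["MetalMaterial"],
    "StoneMaterial": ["StoneTexture"],
    "MetalMaterial": ["MetalTexture"],
}

to_reload = first_referencers(entity_references, asset_references, "StoneTexture")
```

Changing the stone texture selects both models that use the stone material and leaves the lamp model untouched.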

This is done by the EditorContentLoader class. At initialization, this class collects all first-referencer assets and builds them. Each time an asset is built, it is then loaded into the scene editor game, and the references (from the entity to the asset) are updated. This means that this class needs to track all first-referencers on its own and update them. This is done specifically by the LoaderReferenceManager object. The references are collected from the GameEditorChangePropagator, an object responsible for synchronizing changes between the assets and the game (for all properties, including non-references). There is one instance of it per entity. When a property of an entity that contains a reference to an asset (a first-referencer) is modified, the propagator triggers the work to compile and update the entity. When a referenced asset is modified by the user, EditorContentLoader.AssetPropertiesChanged is responsible for gathering, building, unloading and reloading what needs to be reloaded.

Additional Todos

TODO: GetInputFiles exists both in Command and in IAssetCompiler. It has the same signature in both cases, so in the compiler it returns information using ObjectUrl and UrlType, even though what we are trying to describe there is a dependency. That signature should be changed so that it returns information using BuildDependencyType and AssetCompilationContext, just like the GetInputTypes method. Also, the method is passed to the command via the InputFilesGetter, which is not very nice and has to be done manually (very error-prone: we had multiple commands that were missing it!). An automated way should be provided.

TODO: The current design of the build steps and list build steps is a tree. For this reason, the same build steps are often generated multiple times and appear in multiple trees. It would be possible to cache and share build steps if the structure were a graph rather than a tree. To do that, the Parent property of build steps should be removed. The main difficulty is that the way output objects flow between build steps has to be rewritten.
