Well-behaved guest code

A brief guide to code injection etiquette
Published on Monday, May 30, 2022
Photo by Jon Tyson on Unsplash

Did you write a source generator (better yet, an incremental generator)? Or does your NuGet package add Compile items to projects, either using ready-made source files or generating them on the fly with an MSBuild task?

Whatever the case, your noble intent notwithstanding, let me give you a user's perspective: your kids are treading on my lawn and I fear for my flowers.

Muddy footprints all over the place

You see, once you add your source files to a project of mine, they become sort of my responsibility. For example:

  • if we have different code styles, code style analyzers will start spitting warnings all over my project, demanding that I correct your code (which, of course, needs no correcting, least of all by yours truly);
  • code coverage tools will insist that, since your code is now part of the project, my unit tests should cover it;
  • code metrics tools will unnecessarily take into account your code as well as mine;
  • the same goes for StyleCop analyzers, Public API analyzers... I think you get the picture by now;
  • last but not least, while debugging my code I can find myself stepping through your methods, which is both distracting (debugging carries a high-enough cognitive load already) and unexpected (can't it just behave like all other dependencies?).

All this unless your code is well-behaved, of course. Here I mean both code you generated for me, and code you wrote, included in a NuGet package, and injected into my project via Compile items; for lack of a better term, I'll collectively call it guest code.

Who's a good kid?

So what does it mean for guest code to be well-behaved? Turns out it depends upon whom you ask:

  • code analyzers will refrain from complaining about code in source files marked as auto-generated (we'll see how in a minute);
  • coverage tools can usually be instructed both to ignore code on a file-by-file basis, and to use attributes to ignore single types and/or members;
  • code metrics tools use attributes to ignore whole types;
  • debuggers can skip types and/or members, based on... yep, you guessed it: yet another attribute.

Let's see how guest code can be "educated" to not wreak havoc on other people's projects.

How to exclude guest code from code analysis

Roslyn determines whether a source file has been automatically generated (or at least automatically injected into a project) either before deciding whether to pass the file to analyzers, or before deciding whether to actually report the diagnostics that analyzers produce for it. The details are a bit sketchy to me, to be honest, as Roslyn is a really huge project to look into. The outcome, however, is the same in both cases: if a source file is found to be a guest, no error, warning, or even informational message from analyzers will be shown for it.[1]

Roslyn considers a source file "generated", thus excluding it from code analysis, if at least one of the following conditions is satisfied:

  • the file name (excluding the .cs or .vb extension)[2] ends with .designer, .generated, .g, or .g.i. This test is case-insensitive, so MyClass.g.cs, MyClass.G.cs, and MyClass.GenERateD.cs will equally do;
  • the file contains a comment, placed before the first token of actual code, that contains (in any position) one of the two strings <autogenerated and <auto-generated (no actual XML involved). The comparison is case-sensitive, so // <AutoGenerated> and // <AUTO-generated> will not work.

The code that checks for these conditions is pretty straightforward; it is contained in the aptly-named GeneratedCodeUtilities class, in case you'd like to take a look.
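
To make the two conditions listed above concrete, here's a rough approximation in plain C#. It is not Roslyn's code: the GuestFileChecks class and its method names are made up for illustration, and the comment scan only understands leading // comment lines, whereas Roslyn inspects everything that precedes the first token.

using System;
using System.IO;
using System.Linq;

// Hypothetical helper, loosely mimicking the two checks described above.
internal static class GuestFileChecks
{
    private static readonly string[] Suffixes = { ".designer", ".generated", ".g", ".g.i" };
    private static readonly string[] Markers = { "<autogenerated", "<auto-generated" };

    // Condition 1: the name, minus its extension, ends with a known suffix (case-insensitive).
    public static bool HasGeneratedFileName(string path)
    {
        var name = Path.GetFileNameWithoutExtension(path);
        return Suffixes.Any(s => name.EndsWith(s, StringComparison.OrdinalIgnoreCase));
    }

    // Condition 2, simplified: a leading // comment contains one of the markers (case-sensitive).
    public static bool StartsWithAutoGeneratedComment(string source)
    {
        using (var reader = new StringReader(source))
        {
            for (var line = reader.ReadLine(); line != null; line = reader.ReadLine())
            {
                var trimmed = line.Trim();
                if (trimmed.Length == 0) continue;                                    // skip blank lines
                if (!trimmed.StartsWith("//", StringComparison.Ordinal)) return false; // first real code: stop
                if (Markers.Any(m => trimmed.Contains(m))) return true;               // ordinal comparison
            }
        }

        return false;
    }
}

With this sketch, HasGeneratedFileName("MyClass.G.cs") is true, while StartsWithAutoGeneratedComment("// <AutoGenerated>\nclass C {}") is false because of the different casing.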

So, which of the two methods is better? Roslyn doesn't care either way, but I'd say do both; here's why.

The file name may help code coverage tools identify files to ignore; for example, Coverlet has an "exclude by file" feature that accepts wildcards, as in *.g.cs. The file name is very easy to check without even opening the file, which makes it rather likely that other third-party tools might implement similar filtering features.

RULE OF THE HOUSE #1: Every guest source file's name MUST end in .g (recommended for brevity) or .generated, plus the appropriate extension according to the source language (.cs / .vb).

On the other hand, the <auto-generated comment will help users browsing source code. It says "do not modify this code: it may be regenerated, and you'll lose your changes". Quite useful, if you ask me, especially during frantic debugging sessions.

A special note to source generator authors: the name you "give" to a source file is only a suggestion. Notice how the first parameter of GeneratorExecutionContext.AddSource is called "hintName", not "fileName"? Roslyn may decide to change it for whatever reason, even though, as of version 4.2, it practically never does.[3] This leaves the initial comment as the only safe way to mark a source file as auto-generated in source generators.

RULE OF THE HOUSE #2: Every guest source file MUST begin with one or more comments (either single- or multi-line) of which one MUST contain either the string <autogenerated or <auto-generated.

For the sake of clarity and completeness, the initial comment(s) should contain the "magical" string as part of an XML tag (just in case some third-party tool requires or expects it), and should include a human-readable notice about the file not being user-modifiable. For example:

// <auto-generated>
// This file was automatically generated by {{packageName}}. DO NOT MODIFY!
// </auto-generated>

For a file that was not automatically generated, the notice wording may be different. For example:

// <auto-generated>
// This file is part of {{packageName}}. DO NOT MODIFY!
// </auto-generated>
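
In a source generator, rules #1 and #2 can be honored in one place. The following is a minimal sketch, not a complete generator: MyPackageGenerator and the emitted Greeter class are hypothetical, and the code assumes a reference to the Microsoft.CodeAnalysis.CSharp package.

using Microsoft.CodeAnalysis;

// Hypothetical generator: emits one trivial class, with the header and a ".g.cs" hint name.
[Generator]
public sealed class MyPackageGenerator : ISourceGenerator
{
    // Rule #2: every emitted file starts with this comment.
    private const string Header =
        "// <auto-generated>\n" +
        "// This file was automatically generated by MyPackage. DO NOT MODIFY!\n" +
        "// </auto-generated>\n\n";

    public void Initialize(GeneratorInitializationContext context)
    {
        // Nothing to set up for this sketch.
    }

    public void Execute(GeneratorExecutionContext context)
    {
        const string body = "internal static class Greeter { }\n";

        // Rule #1: the hint name ends in ".g.cs". It is only a hint, hence the header above.
        context.AddSource("Greeter.g.cs", Header + body);
    }
}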

How to exclude generated code from code coverage

File name filtering is of course good to have, but source files do not always have a 1-1 correspondence with types. Code coverage tools can therefore be instructed to exclude types and/or members based on the presence of a specified attribute: two examples are Coverlet and AltCover. If only we could all agree on which attribute to use...

Luckily for us all, Microsoft solved this way back in 2010: ExcludeFromCodeCoverageAttribute was born with .NET Framework 4.0 and is available in every target framework currently supported by Microsoft. This, my dear package author, leaves you with no excuse.

RULE OF THE HOUSE #3: Every guest type or member MUST have an ExcludeFromCodeCoverage attribute.
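
As a quick illustration (GuestHelper, UserClass, and their members are hypothetical names), the attribute goes on the type when the whole type is guest code, and on the single members when guest code only adds to a user's partial type:

using System.Diagnostics.CodeAnalysis;

// A whole guest type: one attribute covers all of its members.
[ExcludeFromCodeCoverage]
internal static class GuestHelper
{
    public static int Twice(int value) => value * 2;
}

// A user type augmented by guest code: only the added member is marked,
// so the user's own members still count towards coverage.
internal partial class UserClass
{
    [ExcludeFromCodeCoverage]
    internal string Describe() => "UserClass, as described by guest code";
}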

How to exclude generated code from code metrics

There are currently three kinds of tools to measure code metrics in .NET. On one side we have code metrics quality analyzers that, being analyzers, are already dealt with by rules #1 and #2. On the other side, though, there are Visual Studio's "Calculate Code Metrics" menu command and command-line code metrics, neither of which could care less about the name of a source file or the comments it starts with.

Microsoft's documentation states, I quote, "Mostly, Code Metrics ignores generated code when it calculates the metrics values". Then they don't even try to explain how code metrics tools can tell generated code apart.

A little digging in Roslyn analyzers' source code, however, reveals the trick: types that have either a GeneratedCode or a CompilerGenerated attribute are exempt from code metrics measurements. If you're curious, take a look inside the MetricsHelper class.

Of the two attributes I just mentioned, CompilerGenerated is reserved for use by the compiler to mark, for example, default constructors and accessors of auto-implemented properties. Citing an old article from the Code Analysis team blog (emphasis mine):

CompilerGeneratedAttribute

This attribute is for compiler use only and indicates that a particular code element is compiler generated. This should never be used in source code whatsoever. In fact, some users believe that usage of it should be a compilation error. I tend to agree.

This leaves you with the GeneratedCode attribute, whose only constructor takes as parameters the name and version of the tool that generated the code, like this:

using System.CodeDom.Compiler;

namespace MyPackage
{
    [GeneratedCode("MyPackage", "vX.Y.Z")]
    class MyClass
    {
        // etc.
    }
}

Each parameter of the GeneratedCodeAttribute constructor can be null, which is hardly surprising in a class that dates back to .NET Framework 2.0, long before C# 8.0 introduced nullable reference types. My advice is to play it safe by providing both strings.

Despite GeneratedCodeAttribute being applicable to single members too, code metrics tools will only check it on types. There is no way of excluding single members from code metrics. However, when adding members to a partial type, it certainly doesn't hurt to mark them with GeneratedCode, as some other third-party tool could have a good use for it.

RULE OF THE HOUSE #4: Every guest type or member MUST have a GeneratedCode attribute, constructed with the name and version of the package it comes from.
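
To see what the attribute actually carries, here's a small sketch that reads it back via reflection. This is only an illustration: Roslyn-based tools inspect symbols at compile time rather than using reflection, and MyClass and Program are hypothetical names.

using System;
using System.CodeDom.Compiler;
using System.Reflection;

[GeneratedCode("MyPackage", "1.2.3")]
internal class MyClass
{
}

internal static class Program
{
    private static void Main()
    {
        // GetCustomAttribute<T> returns null when the attribute is absent.
        var attribute = typeof(MyClass).GetCustomAttribute<GeneratedCodeAttribute>();

        Console.WriteLine(attribute is null
            ? "hand-written"
            : $"generated by {attribute.Tool} {attribute.Version}");
        // prints "generated by MyPackage 1.2.3"
    }
}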

How to mark generated code for debuggers

Just add a DebuggerNonUserCode attribute to every type you inject into a project. If you add members to a partial type, add the attribute to every added member instead.

This one was surprisingly easy to find out, thanks to a documentation page with a meaningful "Remarks" section. This brings us straight to

RULE OF THE HOUSE #5: Every guest type or member MUST have a DebuggerNonUserCode attribute.
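
Here's a tiny sketch of the effect, using a hypothetical GuestRetry helper. As far as I can tell, with Visual Studio's "Just My Code" option enabled, stepping into the call to GuestRetry.Run should land directly in the user's lambda, without stopping inside the retry loop.

using System;
using System.Diagnostics;

// Hypothetical guest type: the debugger treats all of it as non-user code.
[DebuggerNonUserCode]
internal static class GuestRetry
{
    // Runs the action, retrying on failure; the last failed attempt rethrows.
    public static void Run(Action action, int attempts)
    {
        for (var i = 1; ; i++)
        {
            try
            {
                action();
                return;
            }
            catch (Exception) when (i < attempts)
            {
                // Swallow the exception and try again.
            }
        }
    }
}

internal static class Program
{
    private static void Main()
    {
        // User code: this lambda is where the debugger is expected to stop.
        GuestRetry.Run(() => Console.WriteLine("user code"), attempts: 3);
    }
}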

Putting it all together

If you are still reading this, my dear package author, you could be thinking that I wrote this article to metaphorically slap you and enjoy your alleged humiliation; or that I'm just a control freak with a blog and too much free time.

Although you certainly remain fully entitled to your opinion, I'm not one to just spit out rules. Instead, I like to offer simple, easily reproducible procedures to comply with rules, and this article is no exception. Here's a brief, step-by-step guide to producing well-behaved guest code.

0. A premise

Let's say we are the authors of a package named Whammo.[4] Whammo injects code into user projects in all sorts of ways:

  • it augments user classes with additional methods via a source generator;
  • the same generator also dynamically generates some helper classes;
  • last but not least, Whammo injects some of its own source files in the compilation via a build\Whammo.targets file, which MSBuild automatically imports in every dependent project.

Our goal is to ensure a developer experience as smooth as possible for Whammo users, by following the rules laid out above.

1. Establish some constants

Since we're going to write a bunch of GeneratedCode attributes, each requiring our package's name and version as parameters, it makes sense to have them available as constants. To this end, let's add one more source file, named WhammoConstants.g.cs, to the user's project:

// <auto-generated>
// This file is part of Whammo. DO NOT MODIFY!
// </auto-generated>

[System.Diagnostics.DebuggerNonUserCode]
[System.Diagnostics.CodeAnalysis.ExcludeFromCodeCoverage]
[System.CodeDom.Compiler.GeneratedCode(WhammoConstants.Name, WhammoConstants.Version)]
internal static class WhammoConstants
{
    public const string Name = "Whammo";
    public const string Version = "v1.2.3";
}

This file may be generated by a source generator (in which case the initial comment had better read "This file was automatically generated by Whammo"), or written as-is and added as a Compile item. It could even be automatically generated in the Whammo project, then packaged and added as a Compile item in user projects.

Notice that the WhammoConstants class is in the global namespace. This makes it easy to refer to it both from Whammo's own code (which probably is in the Whammo namespace) and from generated code (which, at least as long as partial types are involved, will necessarily reside in a user-defined namespace).
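
If you go the source generator route, the constants file can be emitted once, right after the generator is initialized. Here's a minimal sketch using an incremental generator (Roslyn 4.0 or later). WhammoConstantsGenerator is a hypothetical name, and taking the version from the generator's own assembly is an assumption of mine: your build may well have a better source for the package version.

using Microsoft.CodeAnalysis;

[Generator]
public sealed class WhammoConstantsGenerator : IIncrementalGenerator
{
    public void Initialize(IncrementalGeneratorInitializationContext context)
    {
        // Assumption: the generator assembly's version matches the package version.
        // Note that this yields four parts, e.g. "1.2.3.0".
        var version = typeof(WhammoConstantsGenerator).Assembly.GetName().Version?.ToString() ?? "0.0.0.0";

        // Post-initialization output does not depend on user code, so it is added only once.
        context.RegisterPostInitializationOutput(ctx => ctx.AddSource(
            "WhammoConstants.g.cs",
            $@"// <auto-generated>
// This file was automatically generated by Whammo. DO NOT MODIFY!
// </auto-generated>

[System.Diagnostics.DebuggerNonUserCode]
[System.Diagnostics.CodeAnalysis.ExcludeFromCodeCoverage]
[System.CodeDom.Compiler.GeneratedCode(WhammoConstants.Name, WhammoConstants.Version)]
internal static class WhammoConstants
{{
    public const string Name = ""Whammo"";
    public const string Version = ""v{version}"";
}}
"));
    }
}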

2. Generate well-behaved types

Here's what a well-behaved, automatically generated type may look like.

// <auto-generated>
// This file was automatically generated by Whammo. DO NOT MODIFY!
// </auto-generated>

namespace Whammo
{
    /// <summary>
    /// Goes all WHAM! on foobars.
    /// </summary>
    [System.Diagnostics.DebuggerNonUserCode]
    [System.Diagnostics.CodeAnalysis.ExcludeFromCodeCoverage]
    [System.CodeDom.Compiler.GeneratedCode(WhammoConstants.Name, WhammoConstants.Version)]
    internal class GeneratedClass
    {
        // etc.
    }
}

Needless to say, the name of the file containing the above code should be GeneratedClass.g.cs.

Once you mark a type with the necessary attributes, there is no need to also mark its members. Things change, however, when you add members to a pre-existing type, as we'll see straight away.

3. Augment user types with well-behaved members

Although at this point you've probably guessed it by yourself, here's what a well-behaved, automatically generated method may look like.

// <auto-generated>
// This file was automatically generated by Whammo. DO NOT MODIFY!
// </auto-generated>

namespace UserNamespace
{
    partial class UserClass
    {
        /// <summary>
        /// Goes places, does things.
        /// </summary>
        [System.Diagnostics.DebuggerNonUserCode]
        [System.Diagnostics.CodeAnalysis.ExcludeFromCodeCoverage]
        [System.CodeDom.Compiler.GeneratedCode(WhammoConstants.Name, WhammoConstants.Version)]
        void GeneratedMethod()
        {
            // etc.
        }
    }
}

The file name must be chosen wisely here, as the user may have two UserClass types in different namespaces. Although the presence of the initial comments will help mitigate any damage Roslyn may inflict by renaming one or more of the files, it's fairly easy to construct a name like UserNamespace-UserClass.g.cs to avoid ambiguities.

If we augment the same type in two distinct workflows of the same incremental generator, we can use a suffix to distinguish generated files. For example, the above code may be in UserNamespace-UserClass-GeneratedMethod.g.cs. There are no strict rules here, just do your best to preserve the dot-g part.
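
The naming scheme just described fits in a few lines. HintNames is a hypothetical helper, and it deliberately ignores nested and generic types, whose names would need some extra care.

// Hypothetical helper that builds hint names following the scheme above.
internal static class HintNames
{
    // Builds "Namespace-Type[-Suffix].g.cs"; types in the global namespace get no prefix.
    public static string For(string namespaceName, string typeName, string suffix = "")
    {
        var name = string.IsNullOrEmpty(namespaceName)
            ? typeName
            : namespaceName + "-" + typeName;

        if (!string.IsNullOrEmpty(suffix))
        {
            name += "-" + suffix;
        }

        // Whatever comes before, preserve the dot-g part.
        return name + ".g.cs";
    }
}

For example, HintNames.For("UserNamespace", "UserClass") returns "UserNamespace-UserClass.g.cs", and HintNames.For("UserNamespace", "UserClass", "GeneratedMethod") returns "UserNamespace-UserClass-GeneratedMethod.g.cs".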

4. Educate your own types

If a type was written for the exclusive purpose of adding it to user projects, marking its source code with the usual set of comments and attributes is pretty straightforward. Let's take a look at WhammoHelper.g.cs:

// <auto-generated>
// This file is part of Whammo. DO NOT MODIFY!
// </auto-generated>

namespace Whammo
{
    /// <summary>
    /// Helps other classes go WHAM!
    /// </summary>
    [System.Diagnostics.DebuggerNonUserCode]
    [System.Diagnostics.CodeAnalysis.ExcludeFromCodeCoverage]
    [System.CodeDom.Compiler.GeneratedCode(WhammoConstants.Name, WhammoConstants.Version)]
    internal static class WhammoHelper
    {
        // etc.
    }
}

Things get a bit more involved for types that can be either compiled as public in their own project (say Whammo.dll), or included as internal in user projects. We obviously don't want to skew coverage or code metrics measures in the Whammo project, so we only need the attributes in the second case.

Fortunately, we can exploit the C# preprocessor to have our cake and eat it, too. First we have to define a WHAMMO_DLL preprocessor symbol in our project file Whammo.csproj:

    <PropertyGroup>
      <DefineConstants>$(DefineConstants);WHAMMO_DLL</DefineConstants>
    </PropertyGroup>

Then we can use the symbol as an indicator that WhammoHelper is being compiled inside our project:

// <auto-generated>
// This file is part of Whammo. DO NOT MODIFY!
// </auto-generated>

namespace Whammo
{
    /// <summary>
    /// Helps other classes go WHAM!
    /// </summary>
#if WHAMMO_DLL
    public
#else
    [System.Diagnostics.DebuggerNonUserCode]
    [System.Diagnostics.CodeAnalysis.ExcludeFromCodeCoverage]
    [System.CodeDom.Compiler.GeneratedCode(WhammoConstants.Name, WhammoConstants.Version)]
    internal
#endif    
    static class WhammoHelper
    {
        // etc.
    }
}

Naming this source file WhammoHelper.g.cs will do no harm to our project... unless we ourselves use some other generator. In this case, code coverage tools won't be able to tell our files apart from generated files (remember, WhammoHelper is not a "generated" file here - it's part of the project).

The easiest solution, when practicable, is to configure code coverage tools to include / exclude files based on their on-disk path. A more elegant solution would be to omit the .g from our file names, then copy the files to a new folder, adding the .g, just before packing.
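
The copy-and-rename step could look something like the sketch below. GuestFilePacker is a hypothetical helper; how you run it before packing (a small console tool, a build script, or an equivalent MSBuild target) is up to you.

using System.IO;

// Hypothetical pre-pack step: copies source files to a staging folder, adding ".g" to their names.
internal static class GuestFilePacker
{
    // "WhammoHelper.cs" becomes "WhammoHelper.g.cs".
    public static string ToGuestName(string fileName) =>
        Path.GetFileNameWithoutExtension(fileName) + ".g" + Path.GetExtension(fileName);

    // Copies every .cs file directly inside sourceDir to targetDir under its guest name.
    public static void CopyAsGuestFiles(string sourceDir, string targetDir)
    {
        Directory.CreateDirectory(targetDir);

        foreach (var path in Directory.EnumerateFiles(sourceDir, "*.cs"))
        {
            var target = Path.Combine(targetDir, ToGuestName(Path.GetFileName(path)));
            File.Copy(path, target, overwrite: true);
        }
    }
}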

Conclusion

The .NET ecosystem makes it fairly easy to augment other developers' projects with our own code. We should, however, take some steps to avoid turning a useful dependency into an annoyance.

Happy programming!


  1. This of course does not include compiler errors - only analyzer diagnostics.

  2. I know there are other .NET compilers besides Roslyn, and other programming languages besides C# and Visual Basic, but they are totally outside my area of expertise. This whole article could well be moot for them AFAIK.

  3. As long as you don't try to generate two identically-named source files from the same generator.

  4. Weird name, maybe, but a lot less boring than "Foobar".