Writing Prometheus exporters – the Lazy Dev way

This article is part of the C# Advent Series. Christmas has a special place in our hearts and this event is also a wonderful way to help build up the C# community. Do check out awesome content from other authors!

There are a couple of things about Christmas in the Southern Hemisphere that tend to hit us pretty hard each year: first, the fact that it is summer and it’s scorching hot outside. And second, the customary closedown of almost all businesses (which probably started as a response to the first point). Some businesses, however, keep boxing on.

One of our clients is into cryptocurrency mining, and they could not care less about staff wanting time off to spend with family. Their only workforce is GPUs, and these devices can work 24/7. However, with temperatures creeping up, efficiency takes a hit. Also, other sad things can happen:

Solution design

Our first suggestion was to use our trusty ELK+G and poll extra data from the NVIDIA SMI tool, but we soon figured out that this problem had already been solved for us. Mining software nowadays has become extremely sophisticated (and obfuscated): it comes with its own web server and API. So, we simplified a bit:

All we have to do here is stand up an exporter and set up a few dashboards. Easy.

Hosted Services

We essentially need to run two services: one to poll the underlying API and one to expose metrics in a Prometheus-friendly format. We felt the .NET Core Generic Host infrastructure would fit very well here. It allows us to bootstrap the app, add Hosted Services and leave the plumbing to Docker. The program ended up looking like so:

class Program
{
    private static async Task Main(string[] args)
    {
        using IHost host = CreateHostBuilder(args).Build();

        await host.RunAsync();
    }

    static IHostBuilder CreateHostBuilder(string[] args) =>
        Host.CreateDefaultBuilder(args)
            .ConfigureAppConfiguration((configuration) =>
            {
                configuration.AddEnvironmentVariables("TREX"); // can add more sources such as command line
            })
            .ConfigureServices(c =>
            {
                c.AddSingleton<MetricCollection>(); // this is where we keep all metrics state, hence singleton
                c.AddHostedService<PrometheusExporter>(); // exposes MetricCollection
                c.AddHostedService<TRexPoller>(); // periodically GETs status and updates MetricCollection
            });
}

Defining services

The two parts of our application are TRexPoller and PrometheusExporter. Writing both is trivial and we won’t spend much time on the code here. Feel free to check it out on GitHub. The point to make is that it has never been easier to focus on business logic and leave the heavy lifting to the respective NuGet packages.
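To give a flavour of just how little code is involved, a poller can be as simple as a BackgroundService that keeps asking the miner API for its summary and hands the result over to MetricCollection. Below is a hedged sketch only: the endpoint URL and polling interval are assumptions, and the real implementation in the repo adds configuration and error handling.

internal class TRexPoller : BackgroundService
{
    private static readonly HttpClient Client = new HttpClient();
    private readonly MetricCollection _metrics;

    public TRexPoller(MetricCollection metrics) => _metrics = metrics;

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            // the endpoint and interval are made up for this sketch - real values belong in configuration
            var json = await Client.GetStringAsync("http://127.0.0.1:4067/summary");
            _metrics.Update(JsonConvert.DeserializeObject<TRexResponse>(json));
            await Task.Delay(TimeSpan.FromSeconds(15), stoppingToken);
        }
    }
}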

Crafting the models

The most important part of our application is of course telemetry. We grabbed a sample json response from the API and used an online tool to convert that into C# classes:

// generated code looks like this. A set of POCOs with each property decorated with JsonProperty that maps to api response
public partial class Gpu
{
    [JsonProperty("device_id")]
    public int DeviceId { get; set; }
    [JsonProperty("hashrate")]
    public int Hashrate { get; set; }
    [JsonProperty("hashrate_day")]
    public int HashrateDay { get; set; }
    [JsonProperty("hashrate_hour")]
    public int HashrateHour { get; set; }
...
}

Now we need to define metrics that Prometheus.Net can later discover and serve up:

// example taken from https://github.com/prometheus-net/prometheus-net#quick-start
private static readonly Counter ProcessedJobCount = Metrics
    .CreateCounter("myapp_jobs_processed_total", "Number of processed jobs.");
...
ProcessJob();
ProcessedJobCount.Inc();

Turning on lazy mode

This is where we got so inspired by our “low code” solution that we didn’t want to get down to hand-crafting a bunch of class fields to describe every single value the API serves. Luckily, C# 9 has a new feature just for us: Source Generators to the rescue! We’ve covered the basic setup before, so we’ll skip that part here and move on to the Christmas magic.

Let Code Generators do the work for us

Before we hand everything over to robots, we need to set some basic rules to control the process. Custom attributes looked like a sensible way to keep all configuration local with the model POCOs:

[AddInstrumentation("gpus")] // the first attribute prompts the generator to loop through the properties and search for metrics 
public partial class Gpu
{
    [JsonProperty("device_id")]
    public int DeviceId { get; set; }
    [JsonProperty("hashrate")]
    /*
     * the second attribute controls which type the metric will have as well as what labels we want to store with it.
     * In this example, it's a Gauge with gpu_id, vendor and name being labels for grouping in Prometheus
     */
    [Metric("Gauge", "gpu_id", "vendor", "name")]
    public int Hashrate { get; set; }
    [JsonProperty("hashrate_day")]
    [Metric("Gauge", "gpu_id", "vendor", "name")]
    public int HashrateDay { get; set; }
    [JsonProperty("hashrate_hour")]
    [Metric("Gauge", "gpu_id", "vendor", "name")]
    public int HashrateHour { get; set; }

Finally, the generator itself hooks into ClassDeclarationSyntax and looks for well-known attributes:

public void OnVisitSyntaxNode(SyntaxNode syntaxNode)
{
    if (syntaxNode is ClassDeclarationSyntax cds && cds.AttributeLists
        .SelectMany(al => al.Attributes)
        .Any(a => (a.Name as IdentifierNameSyntax)?.Identifier.ValueText == "AddInstrumentation"))
    {
        ClassesToProcess.Add(cds);
    }
}
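For context, this receiver is then registered by the generator itself via the standard ISourceGenerator plumbing. A rough sketch (the generator and receiver class names here are made up; the interesting generation logic is covered next):

[Generator]
public class InstrumentationGenerator : ISourceGenerator
{
    public void Initialize(GeneratorInitializationContext context)
    {
        // the receiver collects candidate classes while the compiler walks the syntax trees
        context.RegisterForSyntaxNotifications(() => new InstrumentationSyntaxReceiver());
    }

    public void Execute(GeneratorExecutionContext context)
    {
        if (context.SyntaxReceiver is not InstrumentationSyntaxReceiver receiver) return;

        foreach (var cds in receiver.ClassesToProcess)
        {
            // GenerateFor builds the GetMetrics/UpdateMetrics bodies shown below
            context.AddSource($"{cds.Identifier.ValueText}_Instrumentation.cs", GenerateFor(cds));
        }
    }

    private static string GenerateFor(ClassDeclarationSyntax cds) =>
        /* string-building covered below */ string.Empty;
}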

Once we’ve got our list, we loop through each property and generate a dictionary of Collector objects.

var text = new StringBuilder(@"public static Dictionary<string, Collector> GetMetrics(string prefix)
    {
        var result = new Dictionary<string, Collector>
        {").AppendLine();
foreach (PropertyDeclarationSyntax p in properties)
{
    var jsonPropertyAttr = p.GetAttr("JsonProperty");
    var metricAttr = p.GetAttr("Metric");
    if (metricAttr == null) continue;
    var propName = jsonPropertyAttr.GetFirstParameterValue();
    var metricName = metricAttr.GetFirstParameterValue(); // determine metric type
    if (metricAttr.ArgumentList.Arguments.Count > 1)
    {
        var labels = metricAttr.GetTailParameterValues(); // if we have extra labels to process - here's our chance 
        text.AppendLine(
            $"{{$\"{{prefix}}{attrPrefix}_{propName}\", Metrics.Create{metricName}($\"{{prefix}}{attrPrefix}_{propName}\", \"{propName}\", {commonLabels}, {labels}) }},");
    }
    else
    {
        text.AppendLine(
            $"{{$\"{{prefix}}{attrPrefix}_{propName}\", Metrics.Create{metricName}($\"{{prefix}}{attrPrefix}_{propName}\", \"{propName}\", {commonLabels}) }},");
    }
}
text.AppendLine(@"};
                return result;
            }");

In parallel to defining storage for metrics, we also need to generate code that will update values as soon as we’ve heard back from the API:

private StringBuilder UpdateMetrics(List<MemberDeclarationSyntax> properties, SyntaxToken classToProcess, string attrPrefix)
{
    var text = new StringBuilder($"public static void UpdateMetrics(string prefix, Dictionary<string, Collector> metrics, {classToProcess} data, string host, string slot, string algo, List<string> extraLabels = null) {{");
    text.AppendLine();
    text.AppendLine(@"if(extraLabels == null) { 
                            extraLabels = new List<string> {host, slot, algo};
                        }
                        else {
                            extraLabels.Insert(0, algo);
                            extraLabels.Insert(0, slot);
                            extraLabels.Insert(0, host);
                        }");
    foreach (PropertyDeclarationSyntax p in properties)
    {
        var jsonPropertyAttr = p.GetAttr("JsonProperty");
        var metricAttr = p.GetAttr("Metric");
        if (metricAttr == null) continue;
        var propName = jsonPropertyAttr.GetFirstParameterValue();
        var metricName = metricAttr.GetFirstParameterValue();
        var newValue = $"data.{p.Identifier.ValueText}";
        text.Append(
            $"(metrics[$\"{{prefix}}{attrPrefix}_{propName}\"] as {metricName}).WithLabels(extraLabels.ToArray())");
        switch (metricName)
        {
            case "Counter": text.AppendLine($".IncTo({newValue});"); break;
            case "Gauge": text.AppendLine($".Set({newValue});"); break;
        }
    }
    text.AppendLine("}").AppendLine();
    return text;
}

Bringing it all together with MetricCollection

Finally, we can use the generated code to bootstrap metrics on a per-model basis and ensure we correctly handle updates:

internal class MetricCollection
{
    private readonly Dictionary<string, Collector> _metrics;
    private readonly string _prefix;
    private readonly string _host;
    public MetricCollection(IConfiguration configuration)
    {
        _prefix = configuration.GetValue<string>("exporterPrefix", "trex");
        _metrics = new Dictionary<string, Collector>();
        // this is where declaring partial classes and generating extra methods makes for a seamless development experience
        foreach (var (key, value) in TRexResponse.GetMetrics(_prefix)) _metrics.Add(key, value);
        foreach (var (key, value) in DualStat.GetMetrics(_prefix)) _metrics.Add(key, value);
        foreach (var (key, value) in Gpu.GetMetrics(_prefix)) _metrics.Add(key, value);
        foreach (var (key, value) in Shares.GetMetrics(_prefix)) _metrics.Add(key, value);
    }
    public void Update(TRexResponse data)
    {
        TRexResponse.UpdateMetrics(_prefix, _metrics, data, _host, "main", data.Algorithm);
        DualStat.UpdateMetrics(_prefix, _metrics, data.DualStat, _host, "dual", data.DualStat.Algorithm);
        
        foreach (var dataGpu in data.Gpus)
        {
            Gpu.UpdateMetrics(_prefix, _metrics, dataGpu, _host, "main", data.Algorithm, new List<string>
            {
                dataGpu.DeviceId.ToString(),
                dataGpu.Vendor,
                dataGpu.Name
            });
            Shares.UpdateMetrics(_prefix, _metrics, dataGpu.Shares, _host, "main", data.Algorithm, new List<string>
            {
                dataGpu.GpuId.ToString(),
                dataGpu.Vendor,
                dataGpu.Name
            });
        }
    }
}

Peeking into generated code

Just to make sure we’re on the right track, we looked at generated code. It ain’t pretty but it’s honest work:

public partial class Shares {
public static Dictionary<string, Collector> GetMetrics(string prefix)
                {
                    var result = new Dictionary<string, Collector>
                    {
{$"{prefix}_shares_accepted_count", Metrics.CreateCounter($"{prefix}_shares_accepted_count", "accepted_count", "host", "slot", "algo", "gpu_id", "vendor", "name") },
{$"{prefix}_shares_invalid_count", Metrics.CreateCounter($"{prefix}_shares_invalid_count", "invalid_count", "host", "slot", "algo", "gpu_id", "vendor", "name") },
{$"{prefix}_shares_last_share_diff", Metrics.CreateGauge($"{prefix}_shares_last_share_diff", "last_share_diff", "host", "slot", "algo", "gpu_id", "vendor", "name") },
...
};
                            return result;
                        }
public static void UpdateMetrics(string prefix, Dictionary<string, Collector> metrics, Shares data, string host, string slot, string algo, List<string> extraLabels = null) {
if(extraLabels == null) { 
                                    extraLabels = new List<string> {host, slot, algo};
                                }
                                else {
                                    extraLabels.Insert(0, algo);
                                    extraLabels.Insert(0, slot);
                                    extraLabels.Insert(0, host);
                                }
(metrics[$"{prefix}_shares_accepted_count"] as Counter).WithLabels(extraLabels.ToArray()).IncTo(data.AcceptedCount);
(metrics[$"{prefix}_shares_invalid_count"] as Counter).WithLabels(extraLabels.ToArray()).IncTo(data.InvalidCount);
(metrics[$"{prefix}_shares_last_share_diff"] as Gauge).WithLabels(extraLabels.ToArray()).Set(data.LastShareDiff);
...
}
}

Conclusion

This example barely scratches the surface of what’s possible with this feature. Source generators are extremely helpful when we deal with tedious and repetitive development tasks. They also help reduce maintenance overhead by enabling us to switch to a declarative approach. I’m sure we will see more projects coming up where this feature becomes central to the solution.

If you haven’t already, do check out the source code on GitHub. And as for us, we would like to sign off with the warmest greetings of this festive season and best wishes for happiness in the New Year.

Ingesting PowerShell-generated files into Azure Log Analytics? Watch out!

Windows PowerShell is an extremely useful tool when it comes to quickly churning out useful bits of automation. If these scripts run unattended, we’d often sprinkle logs in to aid troubleshooting. What one does with these logs totally depends on the application, but we’ve seen some decent Sentinel deployments with alerting and hunting queries (which is beside the point of today’s post).

We recently had a mysterious issue where we tried ingesting a log file into Azure Log Analytics Workspace but it never came through…

Ingesting custom logs

Generally speaking, this is a very simple operation. Spin up a Log Analytics workspace and add a “Custom Logs” entry:

Finally, make sure the Monitoring Agent is installed on the target machine and that’s it:

After a little while, new log entries would get beamed up to Azure:

Here we let our scripts run and went ahead to grab some drinks. But after a couple of hours and few bottles of fermented grape juice we realised that nothing happened…

What could possibly go wrong?

Having double- and triple-checked our setup, everything looked solid. For the sake of completeness, our PowerShell script was doing something along the following lines:

$scriptDir = "."
$LogFilePath = Join-Path $scriptDir "log.txt"

# we do not overwrite the file. we always append
if (!(Test-Path $LogFilePath))
{
    $LogFile = New-Item -Path $LogFilePath -ItemType File
} else {
    $LogFile = Get-Item -Path $LogFilePath
}

 "$(Get-Date -Format "yyyy-MM-dd HH:mm:ss") Starting Processing" | Out-File $LogFile -Append

# do work

"$(Get-Date -Format "yyyy-MM-dd HH:mm:ss") Ending Processing" | Out-File $LogFile -Append

Nothing fancy, just making sure timestamps are in a supported format. We also made sure we do not rotate the file, as the log collection agent would not pick it up. So, we turned to the documentation:

  • The log must either have a single entry per line or use a timestamp matching one of the following formats at the start of each entry – ✓ check
  • The log file must not allow circular logging or log rotation, where the file is overwritten with new entries – ✓ check
  • For Linux, time zone conversion is not supported for time stamps in the logs – not our case
  • and finally, the log file must use ASCII or UTF-8 encoding. Other formats such as UTF-16 are not supported – let’s look at that a bit closer

Figuring this out

Looking at Out-File, we see that default Encoding is utf8NoBOM. This is exactly what we’re after, but examining our file revealed a troubling discrepancy:

That would explain why the Monitoring Agent would not ingest our custom logs. Fixing this is rather easy: just set the default output encoding at the start of the script: $PSDefaultParameterValues['Out-File:Encoding'] = 'utf8'.
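For the sceptics: a quick way to confirm what a file really contains is to inspect its first bytes, since UTF-16 LE files start with 0xFF 0xFE while UTF-8 with a BOM starts with 0xEF 0xBB 0xBF. A throwaway C# sketch (any hex viewer does the same job; the file path is an assumption):

using System;
using System.IO;

var bom = new byte[3];
using (var fs = File.OpenRead("log.txt")) // path is an assumption
{
    fs.Read(bom, 0, 3);
}

if (bom[0] == 0xFF && bom[1] == 0xFE)
    Console.WriteLine("UTF-16 LE BOM - the agent will not ingest this");
else if (bom[0] == 0xEF && bom[1] == 0xBB && bom[2] == 0xBF)
    Console.WriteLine("UTF-8 BOM");
else
    Console.WriteLine("no BOM - plain ASCII/UTF-8, which is fine");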

But the question of how that could happen still remained…

Check your version

After a few more hours of trying various combinations of inputs and PowerShell parameters, we checked $PSVersionTable.PSVersion and realised we were running PS 5.1. This is where it started to click: the documentation by default pointed us to the latest 7.2 LTS, where the default encoding is different! Indeed, rewinding to PS 5.1 reveals that the default used to be unicode: UTF-16 with the little-endian byte order.

Conclusion

Since PowerShell 7.x+ is not exclusive to Windows anymore, Microsoft seems to have accepted a few changes that depend on the underlying behaviours of the .NET frameworks these versions were built upon. There’s in fact an extensive list of breaking changes that mentions encoding a few times. We totally support the need to advance tooling and converge tech. We, however, hope that as the Monitoring Agent matures, more of these restrictions will get removed and this will not be an issue anymore. Until then – happy cloud computing!

Git SSH setup for Visual Studio

Every now and then we need to set ourselves up a new dev machine. And 99% of the time, that means setting up git source control. We believe that password authentication is a no-no, so we needed a quick way to bootstrap a fresh Windows 10 install to use SSH key pairs.

This Is The Way

Setting things up involves making sure OpenSSH is installed, ssh-agent is running, and a key pair is generated and registered with the agent. Finally, we’d go to http://dev.azure.com/{orgname}/_usersSettings/keys and paste the public key in. This, however, is a laborious task, and most sources online seem to suggest doing it that way. We decided to simplify:

Set-ExecutionPolicy Bypass -Scope Process -Force; [System.Net.ServicePointManager]::SecurityProtocol = [System.Net.ServicePointManager]::SecurityProtocol -bor 3072; iex ((New-Object System.Net.WebClient).DownloadString('https://raw.githubusercontent.com/tkhadimullin/win-ssh-bootstrap/master/install.ps1'))

This will download and run the following:

if (-Not ([Security.Principal.WindowsPrincipal] [Security.Principal.WindowsIdentity]::GetCurrent()).IsInRole([Security.Principal.WindowsBuiltInRole] 'Administrator')) {
    Write-Warning  "Running as non-Admin user. Skipping environment checks"
} else {
    $capability = Get-WindowsCapability -Online | Where-Object Name -like "OpenSSH.Client*"

    if($capability.State -ne "Installed") {
        Write-Information "Installing OpenSSH client"
        Add-WindowsCapability -Online -Name $capability.Name
    } else {
        Write-Information "OpenSSH client installed"
    }

    $sshAgent = Get-Service ssh-agent
    if($sshAgent.Status -eq "Stopped") {$sshAgent | Start-Service}
    if($sshAgent.StartType -eq "Disabled") {$sshAgent | Set-Service -StartupType Automatic }
}

if([String]::IsNullOrWhiteSpace([Environment]::GetEnvironmentVariable("GIT_SSH"))) {
    [Environment]::SetEnvironmentVariable("GIT_SSH", "$((Get-Command ssh).Source)", [System.EnvironmentVariableTarget]::User)
}

$keyPath = Join-Path $env:Userprofile ".ssh\id_rsa" # assuming the default file name here
if(-not (Test-Path $keyPath)) { 
    ssh-keygen -q -f $keyPath -C "autogenerated_key" -N """" # empty password
    ssh-add -q -f $keyPath
} 

$line = Get-Content -Path "$($keyPath).pub" | Select-Object -First 1 # assuming file name and key index

Add-Type -AssemblyName System.Windows.Forms
Add-Type -AssemblyName System.Drawing
$form = New-Object System.Windows.Forms.Form
$form.Text = 'Your SSH Key'
$form.Size = New-Object System.Drawing.Size(600,150)
$form.StartPosition = 'CenterScreen'

$okButton = New-Object System.Windows.Forms.Button
$okButton.Location = New-Object System.Drawing.Point(260,70)
$okButton.Size = New-Object System.Drawing.Size(75,23)
$okButton.Text = 'OK'
$okButton.DialogResult = [System.Windows.Forms.DialogResult]::OK
$form.AcceptButton = $okButton
$form.Controls.Add($okButton)

$label = New-Object System.Windows.Forms.Label
$label.Location = New-Object System.Drawing.Point(10,10)
$label.Size = New-Object System.Drawing.Size(280,20)
$label.Text = 'Copy your key and paste into ADO:'
$form.Controls.Add($label)

$textBox = New-Object System.Windows.Forms.TextBox
$textBox.Location = New-Object System.Drawing.Point(10,30)
$textBox.Size = New-Object System.Drawing.Size(560,40)
$textBox.Text = $line
$textBox.ReadOnly = $true

$form.Controls.Add($textBox)
$form.Add_Shown({$textBox.Select()})
$form.Topmost = $true
$form.ShowDialog()

This script will take care of the prerequisites (if run as admin) or try to generate a key in case everything else is already done. Then it’ll paint a small window with the public key:

The script makes a couple of assumptions about existing keys and will just roll with defaults. Nothing fancy at all. We also wanted to automate posting to ADO, but that did not happen (see below).

Setting up Visual Studio

The next order of business was to set up the IDE. It appears Visual Studio defaults to using password credentials unless we set a GIT_SSH environment variable and point it to ssh.exe from the OpenSSH distribution. The script takes care of that too.

Posting public key to Azure DevOps (not really)

ADO does not have an API for managing SSH keys, so generating PATs and service credentials is not going to help. We can try to make it happen by reverse engineering the front-end call and hoping it’s isolated enough for us to be able to repeat the procedure. It turns out it’s indeed a matter of sending a payload to https://dev.azure.com/{org}/_apis/Contribution/HierarchyQuery – this looks like a common message bus for ADO extensions to post updates to:

{
    "contributionIds": [
        "ms.vss-token-web.personal-access-token-issue-session-token-provider"
    ],
    "dataProviderContext": {
        "properties": {
            "displayName": "key-name",
            "publicData": "ssh-rsa Aaaaaaaaaaaaaabbbbbb key-comment",
            "validFrom": "2021-11-30T08:00:00.000Z",
            "validTo": "2026-11-30T08:00:00.000Z",
            "scope": "app_token",
            "targetAccounts": [
                "xxxxxxxx-xxxx-xxxxx-xxxx-xxxxxxxxxxxx"
            ],
            "isPublic": true
        }
    }
}
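Purely as an illustration, sending that payload is plain HttpClient work. A hypothetical sketch, assuming we already hold a valid UserAuthentication cookie value (more on that below); the method and parameter names are placeholders:

static async Task<string> PostSshKeyAsync(string org, string userAuthCookie, string payloadJson)
{
    using var client = new HttpClient();
    var request = new HttpRequestMessage(HttpMethod.Post,
        $"https://dev.azure.com/{org}/_apis/Contribution/HierarchyQuery");
    request.Headers.Add("Cookie", $"UserAuthentication={userAuthCookie}");
    request.Content = new StringContent(payloadJson, Encoding.UTF8, "application/json");

    var response = await client.SendAsync(request);
    response.EnsureSuccessStatusCode();
    return await response.Content.ReadAsStringAsync();
}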

The first issue awaits us right in the payload: dataProviderContext.targetAccounts needs a value, but we could not find where to fetch it from. It’s loaded along with other content on the page, but opening the page kind of defeats the purpose of automating this task. And unfortunately, that’s not the only obstacle we’ve hit.

Authentication

The front end relies on cookies to authenticate this request. We found that the only one we really need is UserAuthentication:

The value is a standard JWT, issued by app.vstoken.visualstudio.com. Getting it requires us to register an app and have users go through an OAuth flow. Also, since ADO works on the concept of tenants and organisations, it is tricky to get the correct tenancy without an interactive login. It seems doable, but we have deemed it not worth the effort. <sad_face_emoji_here>

Conclusion

Despite not being able to reach our fully automated nirvana, we’ve got to a state where we prep the system for SSH and surface the public key to copy-paste. It seems that reverse engineering the ADO front end and extracting the token from there is very much achievable, but at this stage we’d rather not pursue it. Publishing the code on GitHub gives us a faint hope the Community may push it across the line.

Setting up L2TP VPN with Mikrotik

For quite some time we wanted to be able to securely access our on-prem services, such as the local NAS, IoT hub and Grafana. We have tried setting up PPTP but quickly realised that the technology has long been compromised. IPsec would be a great option, but it requires both ends of the tunnel to have static IP addresses.

OpenVPN and AWS

Theoretically we can simply spin up an EC2 instance from the marketplace or even configure it manually, but we were feeling adventurous.

Setting up a Client VPN Endpoint on AWS effectively stands up a managed OpenVPN instance. We ended up not going with it (and we’ll get to the reasons in a few moments), but let’s quickly go through the steps one would need to take to pull it off. The setup is fairly complex and involved:

  1. Set up a server certificate in AWS Certificate Manager. Public certificates are free, but we had to go through DNS-based ownership validation, which is not that hard but takes anywhere between 15 minutes and a few hours, and we were not planning to use that domain name to connect to our server anyway.
  2. Make sure to pick an IP range that’s big enough (at least /22) and does not overlap with the given VPC.
  3. Stand up some sort of Directory Service for user authentication. Cognito is not an option, and we don’t have AD readily available. Creating a full-fat AD just for VPN seemed overkill, so we created Simple AD (which is still surplus to needs). It would’ve been fine, but to manage it we had to stand up a Windows EC2 instance, which we of course joined to the domain. At this stage it became obvious that creating a virtual EC2 appliance would probably be way easier, but we decided to proceed for the sake of science.

  4. Finally, coming back to the VPC, we created a Gateway and the VPN itself. One thing to keep in mind here is the Transport protocol: Mikrotik only supports TCP. Yuck.
  5. All we would have left to do now is download the .ovpn file and use it to set up our router. But unfortunately, this is where our shenanigans have to stop: RouterOS does not support AES-256-GCM.

L2TP scripts

Since we were standing up compute resources anyway, our goal shifted towards finding the easiest way to set things up. And the IPsec VPN Server Auto Setup Scripts delivered just that! Just running wget https://git.io/vpnquickstart -O vpn.sh && sudo sh vpn.sh on a fresh EC2 instance did the trick for us. One thing to remember is to save the auto-generated credentials the script prints on exit – and that’s almost all of the VPN server setup done.

Since we had a router on the other end and wanted access to internal resources, we had to log in again and add a couple of routes to /etc/ppp/ip-up.local:

#!/bin/bash
/sbin/route add -net 192.168.99.0/24 gw $4 # see for parameters: https://tldp.org/HOWTO/PPP-HOWTO/x1455.html

We also wanted to use conditional routing on the client side and only route certain client machines through the tunnel. For that, /etc/sysconfig/iptables needed a little update:

# Modified by hwdsl2 VPN script
*nat
:POSTROUTING ACCEPT [0:0]
# autogenerated code here
-A POSTROUTING -s 192.168.99.0/24 -j MASQUERADE # adding our own network so it gets NATted

COMMIT

Finally, we needed to allow L2TP traffic through the AWS security group:

Mikrotik setup

With WinBox, setting up VPN in RouterOS is pretty straightforward:

You may notice we opted not to use the VPN as the default route. This solution comes with tradeoffs, but in our case we wanted to tunnel only specific clients. For that we set up policy routing: we added a Mangle rule that marks all connections from the chosen hosts and then assigned a new routing table to those packets:

Conclusion

It is a bit unfortunate that in 2021 Mikrotik still does not properly support OpenVPN. On the other hand, it exposes a lot of configurability to cater for uncommon network layouts. And now we got a bit closer to realising its full potential.

ASP.NET XML serialisation issues: observations on DataContractSerializer

A client reached out to us with a weird problem. Their ASP.NET WebAPI project (somewhat legacy tech) needed to communicate with an application running on a mainframe (dinosaur-grade legacy tech). But they were having XML serialisation issues…

They had a test XML payload, but only half of that kept coming across the wire:

broken serialisation – missing fields

The first thing we suspected was missing DataContract/DataMember attributes, but everything seemed to be okay:

[DataContract(Name = "ComplexObject", Namespace = "")]
public class ComplexObject
{
    [DataMember]
    public int Id { get; set; }
    [DataMember]
    public string Name { get; set; }
    [DataMember]
    public string Location { get; set; }
    [DataMember]
    public string Reference { get; set; }
    [DataMember]
    public double Rate { get; set; }
    [DataMember]
    public double DiscountedRate { get; set; }
}

After scratching our heads for a little while and trying different solutions from across the entirety of StackOverflow, we managed to dig up a piece of documentation that explained this behaviour:

  1. Data members of base classes (assuming serialiser will apply these rules upwards recursively);
  2. Data members in alphabetical order (bingo!);
  3. Data members specifically numbered in decorating attribute;

With the above in mind, we got the following payload to serialise successfully:

successful serialisation
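To see these ordering rules in action locally, it only takes a few lines in a scratch console app; a minimal sketch:

using System;
using System.IO;
using System.Runtime.Serialization;
using System.Text;

var serializer = new DataContractSerializer(typeof(ComplexObject));
using var stream = new MemoryStream();
serializer.WriteObject(stream, new ComplexObject { Id = 1, Name = "test" });

// with no explicit Order, elements come out alphabetically: DiscountedRate, Id, Location, Name, Rate, Reference
Console.WriteLine(Encoding.UTF8.GetString(stream.ToArray()));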

There are other options

.NET comes with at least two XML serialisers: XmlSerializer and DataContractSerializer. A lot has been written about the two. We find this article written by Dan Rigsby to probably be the best source of information on the topic.

The key difference for us was the fact that XmlSerializer does not require any decorations and works out of the box, while DataContractSerializer needs us to make code changes. In our project everything was already set up with DataContract, so we did not have to change anything.

By default, WebAPI projects come configured to leverage DataContractSerializer. It pays to know, however, that in case of any issues we can switch to XmlSerializer:

public static class WebApiConfig
{
    public static void Register(HttpConfiguration config)
    {
        config.Formatters.XmlFormatter.UseXmlSerializer = true; // global setting for all types
        config.Formatters.XmlFormatter.SetSerializer<ComplexObject>(new XmlSerializer(typeof(ComplexObject))); // overriding just for one type
    }
}

Setting order

Yet another option to deal with ASP.NET XML serialisation issues would be to define property order explicitly:

[DataContract(Name = "ComplexObject", Namespace = "")]
public class ComplexObject
{
    [DataMember(Order = 1)]
    public int Id { get; set; }
    [DataMember(Order = 2)]
    public string Name { get; set; }
    [DataMember(Order = 3)]
    public string Location { get; set; }
    [DataMember(Order = 4)]
    public string Reference { get; set; }
    [DataMember(Order = 5)]
    public double Rate { get; set; }
    [DataMember(Order = 6)]
    public double DiscountedRate { get; set; }
}

Conclusion

XML serialisation has been around since the beginning of .NET. And even though it may seem that JSON has taken over, XML isn’t going anywhere any time soon. It is good to know we have many ways to deal with it should we ever need to.

EF Core 6 – custom functions with DbFunction Attribute

We’ve already looked at a way to implement SQL functions via method translation. That went reasonably well, but the next time we had to do something similar we discovered that our code was broken with newer versions of EF Core. So we fixed it again.

Not anymore

Looking through changelogs, we noticed that EF Core 2.0 came with support for mapping scalar functions. It is remarkably simple to set up:

public class MyDbContext : DbContext
    {

        [DbFunction("DECRYPTBYPASSPHRASE", IsBuiltIn = true, IsNullable = false)]
        public static byte[] DecryptByPassphrase(string pass, byte[] ciphertext) => throw new NotImplementedException();

        [DbFunction("DECRYPTBYKEY", IsBuiltIn = true, IsNullable = false)]
        public static byte[] DecryptByKey(byte[] ciphertext) => throw new NotImplementedException();
...

and even easier to use:

var filteredSet = Set
                .Select(m => new Model
                {
                    Id = m.Id,
                    Decrypted = MyDbContext.DecryptByPassphrase("TestPassword", m.Encrypted).ToString(),
                    Decrypted2 = MyDbContext.DecryptByKey(m.Encrypted2).ToString(), // since the key's opened for session scope - just relying on it should do the trick
                }).ToList();

Initially the attribute offered limited configuration options, but starting with EF Core 5.0 this is not an issue.
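If attributes ever feel too restrictive, the same mapping can be expressed fluently in OnModelCreating. A sketch from memory of the EF Core 5+ fluent API (we stuck with the attribute flavour in the actual project):

// inside MyDbContext
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder
        .HasDbFunction(() => DecryptByPassphrase(default, default))
        .HasName("DECRYPTBYPASSPHRASE")
        .IsBuiltIn();
}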

One gotcha with the DECRYPT* functions is that they return varbinary. Trying to use our own EF.Functions.ConvertToVarChar is not going to work since we disabled custom plugins – we want to get rid of that code after all. But apparently .ToString() works as intended:

SELECT [m].[Id], CONVERT(varchar(100), DECRYPTBYPASSPHRASE(N'TestPassword', [m].[Encrypted])) AS [Decrypted], CONVERT(varchar(100), DECRYPTBYKEY([m].[Encrypted2])) AS [Decrypted2], [t].[Id], [t].[IsSomething], [m].[Encrypted], [m].[Encrypted2]...

The full example source is on GitHub, along with other takes we decided to leave in place for history.

Conclusion

Defining custom EF functions was one of the biggest articles we wrote here. And finding out how to fit it together probably was the most challenging and time-consuming project we undertook in recorded history. One can say we totally wasted our time, but I’d like to draw a different conclusion. We had fun, learned something new and were able to appreciate the complexity behind Entity Framework – it is not just an engineering marvel – it is also a magical beast!

Azure Functions – OpenAPI + EF Core = 💥

Creating Swagger-enabled Azure Functions is not that hard to do. Visual Studio literally comes with a template for that:

Inspecting the newly created project we see that it comes down to one NuGet package. It magically hooks into IWebJobsStartup and registers additional routes for Swagger UI and OpenAPI document. When run, it reflects upon suitable entry points in the assembly and builds required responses on the fly. Elegant indeed.
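For reference, the functions the package picks up are just regular HTTP-triggered functions decorated with OpenAPI attributes, roughly along these lines (a sketch based on the template; names are made up):

public static class Greeting
{
    [FunctionName("Greeting")]
    [OpenApiOperation(operationId: "greeting", tags: new[] { "demo" })]
    [OpenApiResponseWithBody(statusCode: HttpStatusCode.OK, contentType: "text/plain", bodyType: typeof(string))]
    public static IActionResult Run(
        [HttpTrigger(AuthorizationLevel.Function, "get")] HttpRequest req) =>
        new OkObjectResult("Merry Christmas!");
}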

Installing Entity Framework

Now, suppose we need to talk to Azure SQL, so we’d like to add EF Core to the mix. As much as we love to go for the latest and greatest, unfortunately it’s a bit messy at the moment. Instead, let’s get a bit more conservative and stick to EF Core 3.1.
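Wiring EF Core into a function app is the usual dependency injection affair via FunctionsStartup. A sketch only: the context class and the connection string setting below are made up for illustration.

[assembly: FunctionsStartup(typeof(MyFunctions.Startup))]

namespace MyFunctions
{
    public class Startup : FunctionsStartup
    {
        public override void Configure(IFunctionsHostBuilder builder)
        {
            // "SqlConnection" is a hypothetical app setting name
            builder.Services.AddDbContext<WeatherContext>(options =>
                options.UseSqlServer(Environment.GetEnvironmentVariable("SqlConnection")));
        }
    }
}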

We did not expect that, did we?

The error message is pretty clear: the assembly somehow did not get copied to the output location. And indeed, the file was missing:

Apparently, when VS builds the function, it makes a second copy of the libraries it thinks are required, and in our case it decided not to pick up the dependency. Adding <_FunctionsSkipCleanOutput>true</_FunctionsSkipCleanOutput> to the project file will fix that:

Are we there yet?

Probably, but there’s a catch: our deployment package just got bigger. Alternatively, we could downgrade EF Core to 3.1.13, which happens to use the same version of Microsoft.Extensions.Logging.Abstractions. This way we’d avoid having to hack project files at the expense of limiting ourselves to an older version of EF Core. Ultimately, we hope the OpenAPI extension picks up the slack and goes GA soon. For now, it looks like we’ll have to stick with the workaround.

Web API – Dev environment in 120 seconds

Quite a few recent engagements saw us developing APIs for clients. Setting these projects up is a lot of fun at first. After a few deployments, however, we felt there should be a way to optimise our workflow and bootstrap environments a bit quicker.

We wanted to craft a skeleton project that would provide structure and repeatability. After a quick validation we decided that the Weather Forecast project is probably good enough as an API starting point. With that part out of the way, we also needed a client application that we could use while developing and when handing the API over to the client.

Our constraints

Given the purpose of our template we also had a few more limitations:

  • Clean desk policy – the only required tools are Docker and VS Code (and lots of RAM! but that would be another day’s problem). Everything else should be transient and should leave no residue on the host system.
  • Offline friendly – demos and handovers can happen on-site, where we won’t necessarily have access to corporate WiFi or wired network
  • Open Source – not a constraint per se, but very nice to have

With that in mind, our first candidate, Postman, was out, and after scratching our heads for a little while we stumbled upon Hoppscotch. A “light-weight, web-based API development suite”, as it says on the tin, it seems to deliver most of the features we’d use.

Setting up with Docker

There are heaps of examples on how to use Hoppscotch, but we haven’t seen a lot regarding self-hosting. Probably, because it’s fairly straightforward to get started:

docker run -it --rm -p 3000:3000 hoppscotch/hoppscotch

After that we should be able to just visit localhost:3000 and see the sleek UI:

hoppscotch UI

Building containers

Before we get too far ahead, let’s codify the bits we already know. We start VS Code and browse through the catalog of available dev containers… This time round we needed to set up and orchestrate at least two containers: our app as well as Hoppscotch, sitting in the same virtual network. That led us to opt for the docker-from-docker-compose container template. On top of that, we enhanced it with a dotnet SDK installation, like our AWS Lambda container.

Finally, the docker-compose.yml needs Hoppscotch service definition at the bottom:

  hoppscotch:
    image: hoppscotch/hoppscotch
    ports:
      - 3000:3000

Should be smooth sailing from here: reopen in container, create a web API project and test away! Right?

A few quirks to keep in mind

As soon as we fired up the UI and tried making simple requests, we realised that Hoppscotch is not immune to CORS restrictions. Developers offer a couple of ways to fix this:

  1. enable CORS in the API itself – that’s what we ended up doing for now (see the sketch after this list)
  2. set up a browser extension, but we couldn’t go that route as it would defeat our clean desk policy. It is also not yet available in the Microsoft Edge extension store
  3. finally, we could use proxyscotch, but that looked like a rabbit hole we may want to explore later.
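For completeness, a minimal sketch of option 1, assuming a .NET 6-style Program.cs and Hoppscotch served on port 3000:

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddControllers();
builder.Services.AddCors(options => options.AddDefaultPolicy(policy => policy
    .WithOrigins("http://localhost:3000") // the Hoppscotch container
    .AllowAnyHeader()
    .AllowAnyMethod()));

var app = builder.Build();
app.UseCors();
app.MapControllers();
app.Run();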

Authentication mechanism support is hopefully coming, so we’ll watch that space.

There’s one more interesting behaviour that caught us off guard: the client would silently fail the SSL certificate check until we manually trusted the host in another tab. There are other, more technical solutions, but the easiest for now is to avoid SSL in development.

Hoppscotch ssl trust error

Conclusion

Once again, we used our weapon of choice and produced an artifact that enables us to develop and test containerised APIs faster!

end to end setup flow

AWS Lambda Dev – environment in 120 seconds

Okay, despite the roaring success we had with the previous attempt at this, setting up VS Code dev containers for AWS SAM proved to be quite a bit of a pain. And we’re still not sure if it’s worth it. But it was interesting to set up and may be useful in some circumstances, so here we go.

Some issues we ran into

The biggest issue by far was the fact that SAM heavily relies on containers, which for us meant we’d have to go deeper and use the docker-in-docker dev container as a starting point. The base image there comes with the bare minimum of software, and the dotnet SDK is not part of it. So, we’ll have to install everything ourselves:

#!/usr/bin/env bash

set -e

if [ "$(id -u)" -ne 0 ]; then
    echo -e 'Script must be run as root. Use sudo, su, or add "USER root" to your Dockerfile before running this script.'
    exit 1
fi

curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
rm -rf ./aws
rm ./awscliv2.zip
echo "AWS CLI version `aws --version`"

curl -L "https://github.com/aws/aws-sam-cli/releases/latest/download/aws-sam-cli-linux-x86_64.zip" -o "aws-sam-cli-linux-x86_64.zip"
unzip aws-sam-cli-linux-x86_64.zip -d sam-installation
sudo ./sam-installation/install
echo "SAM version `sam --version`"
rm -rf ./sam-installation
rm ./aws-sam-cli-linux-x86_64.zip

wget https://packages.microsoft.com/config/debian/11/packages-microsoft-prod.deb -O packages-microsoft-prod.deb
sudo dpkg -i packages-microsoft-prod.deb
rm packages-microsoft-prod.deb
sudo apt-get update; \
  sudo apt-get install -y apt-transport-https && \
  sudo apt-get update && \
  sudo apt-get install -y dotnet-sdk-3.1

# Installing lambda tools was required to get lambda to work while I was testing different approaches. It may have become redundant after so many iterations and changes to the script, but probably does not hurt
dotnet tool install -g Amazon.Lambda.Tools
export PATH="$PATH:$HOME/.dotnet/tools"

This is fairly straightforward: install the AWS CLI and SAM as described in the documentation, and then install the dotnet SDK. All we need to do now is call it from the main Dockerfile.

It also helps to pre-populate container with extensions we’re going to need anyway:

"extensions": [
	"ms-azuretools.vscode-docker",
	"amazonwebservices.aws-toolkit-vscode",
	"ms-dotnettools.csharp",
	"redhat.vscode-yaml",
	"zainchen.json" // this probably can be removed
],

Debugging experience

Apparently, debugging AWS Lambda is slightly different from Azure Functions in the sense that it’s not intended for invocation from a browser but rather accepts an event via the built-in dispatcher. We could potentially spend more time on it and get it to work with browsers, but this looked good enough for a first stab.

Building up the winning sequence

With all of the above in mind we ended up with roughly the following sequence to get debugging to work:

  1. started with the modified Docker-in-Docker template and added all the tools
  2. opened the container up and used the AWS extension to generate a lambda skeleton app (after a couple of failed attempts we settled on the dotnetcore3.1 (image) template)
  3. we then let OmniSharp run, pick up all C# projects and restore packages
  4. after that we rebuilt the container to reinitialise extensions and make sure we were starting afresh
  5. once we reopened the container, we used the AWS extension again to generate a launch configuration (it is important to let SAM know which version of dotnet we’re going to need – check launch.json to verify)
  6. and finally, we ran it

Action!

As always, the code is on GitHub.

Azure Static Web Apps – adding PR support to Azure DevOps pipeline

Last time we took a peek under the hood of Static Web Apps, we discovered a docker container that allowed us to do custom deployments. This, however, left us with an issue: we could create staging environments but could not quite call it a day, as we could not clean up after ourselves.

There is more to custom deployments

Further inspection of the GitHub Actions config revealed there’s one more action that we could potentially exploit to take full advantage of custom workflows. It is called “close”:

name: Azure Static Web Apps CI/CD
....
jobs:
  close_pull_request_job:
    ... bunch of conditions here
    action: "close" # that is our hint!

With the above in mind, we can make an educated guess on how to invoke it with docker:

docker run -it --rm \
   -e INPUT_AZURE_STATIC_WEB_APPS_API_TOKEN=<your deployment token> \
   -e DEPLOYMENT_PROVIDER=DevOps \
   -e GITHUB_WORKSPACE="/working_dir" \
   -e IS_PULL_REQUEST=true \
   -e BRANCH="TEST_BRANCH" \
   -e ENVIRONMENT_NAME="TESTENV" \
   -e PULL_REQUEST_TITLE="PR-TITLE" \
   mcr.microsoft.com/appsvc/staticappsclient:stable \
   ./bin/staticsites/StaticSitesClient close --verbose

Running this indeed closes off an environment. That’s it!

Can we build an ADO pipeline though?

Just running docker containers is not really that useful, as these actions are intended for CI/CD pipelines. Unfortunately, there’s no single config file we can edit to achieve this with Azure DevOps: we’d have to take a bit more of a hands-on approach. Roughly, the solution looks like so:

First, we’ll create a branch policy to kick off deployment to the staging environment. Then we’ll use a Service Hook to trigger an Azure Function on successful PR merge. Finally, the stock-standard Static Web Apps task will run on the master branch when a new commit gets pushed.

Branch policy

Creating branch policy itself is very straightforward: first we’ll need a separate pipeline definition:

pr:
  - master

pool:
  vmImage: ubuntu-latest

steps:
  - checkout: self    
  - bash: |
      docker run \
      --rm \
      -e INPUT_AZURE_STATIC_WEB_APPS_API_TOKEN=$(deployment_token)  \
      -e DEPLOYMENT_PROVIDER=DevOps \
      -e GITHUB_WORKSPACE="/working_dir" \
      -e IS_PULL_REQUEST=true \
      -e BRANCH=$(System.PullRequest.SourceBranch) \
      -e ENVIRONMENT_NAME="TESTENV" \
      -e PULL_REQUEST_TITLE="PR # $(System.PullRequest.PullRequestId)" \
      -e INPUT_APP_LOCATION="." \
      -e INPUT_API_LOCATION="./api" \
      -v ${PWD}:/working_dir \
      mcr.microsoft.com/appsvc/staticappsclient:stable \
      ./bin/staticsites/StaticSitesClient upload

In here we use a PR trigger, along with some variables to push through to Azure Static Web Apps. Apart from that, it’s a simple docker run that we have already had success with. To hook it up, we need a Build Validation check that would trigger this pipeline:

Teardown pipeline definition

The second part is a bit more complicated and requires an Azure Function to pull off. Let’s start by defining the pipeline that our function will run:

trigger: none

pool:
  vmImage: ubuntu-latest

steps:
  - script: |
      docker run --rm \
      -e INPUT_AZURE_STATIC_WEB_APPS_API_TOKEN=$(deployment_token) \
      -e DEPLOYMENT_PROVIDER=DevOps \
      -e GITHUB_WORKSPACE="/working_dir" \
      -e IS_PULL_REQUEST=true \
      -e BRANCH=$(PullRequest_SourceBranch) \
      -e ENVIRONMENT_NAME="TESTENV" \
      -e PULL_REQUEST_TITLE="PR # $(PullRequest_PullRequestId)" \
      mcr.microsoft.com/appsvc/staticappsclient:stable \
      ./bin/staticsites/StaticSitesClient close --verbose
    displayName: 'Cleanup staging environment'

One thing to note here is the manual trigger – we opt out of CI/CD. Then we make a note of the environment variables that our function will have to populate.

Azure Function

It really doesn’t matter what sort of function we create. In this case we opt for C# code that we can author straight from the Portal for simplicity. We also need to generate a PAT so our function can call ADO.

#r "Newtonsoft.Json"

using System.Net;
using System.Net.Http.Headers;
using System.Text;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.Primitives;
using Newtonsoft.Json;

private const string personalaccesstoken = "<your PAT>";
private const string organization = "<your org>";
private const string project = "<your project>";
private const int pipelineId = <your pipeline Id>; 

public static async Task<IActionResult> Run([FromBody]HttpRequest req, ILogger log)
{
    log.LogInformation("C# HTTP trigger function processed a request.");
    string requestBody = await new StreamReader(req.Body).ReadToEndAsync();
    dynamic data = JsonConvert.DeserializeObject(requestBody);	

    log.LogInformation($"eventType: {data?.eventType}");
    log.LogInformation($"message text: {data?.message?.text}");
    log.LogInformation($"pullRequestId: {data?.resource?.pullRequestId}");
    log.LogInformation($"sourceRefName: {data?.resource?.sourceRefName}");

    try
	{
		using (HttpClient client = new HttpClient())
		{
			client.DefaultRequestHeaders.Accept.Add(new System.Net.Http.Headers.MediaTypeWithQualityHeaderValue("application/json"));
			client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Basic", ToBase64(personalaccesstoken));

			string payload = @"{ 
		""variables"": {
			""System.PullRequest.SourceBranch"": {
				""isSecret"": false,
            	""value"": """ + data?.resource?.sourceRefName + @"""
			},
			""System.PullRequest.PullRequestId"": {
				""isSecret"": false,
            	""value"": "+ data?.resource?.pullRequestId + @"
			}
		}
	}";
            var url = $"https://dev.azure.com/{organization}/{project}/_apis/pipelines/{pipelineId}/runs?api-version=6.0-preview.1";
            log.LogInformation($"sending payload: {payload}");
            log.LogInformation($"api url: {url}");
			using (HttpResponseMessage response = await client.PostAsync(url, new StringContent(payload, Encoding.UTF8, "application/json")))
			{
				response.EnsureSuccessStatusCode();
				string responseBody = await response.Content.ReadAsStringAsync();
                return new OkObjectResult(responseBody);
			}
		}
	}
	catch (Exception ex)
	{
		log.LogError("Error running pipeline", ex.Message);
        return new JsonResult(ex) { StatusCode = 500 }; 
	}
}

private static string ToBase64(string input)
{
	return Convert.ToBase64String(System.Text.ASCIIEncoding.ASCII.GetBytes(string.Format("{0}:{1}", "", input)));
}

Service Hook

With all prep work done, all we have left to do is to connect PR merge event to Function call:

The function URL should contain the access key if one was defined. The easiest way is probably to copy it straight from the Portal’s Code + Test blade:

It may also be a good idea to test the connection on the second form before finishing up.

Conclusion

Once everything is connected, the pipelines should create and delete staging environments similarly to what GitHub does. One possible improvement would be to replace the branch policy with yet another Service Hook and Function, so that the PR title gets correctly reflected on the Portal.

But I’ll leave it as a challenge for readers to complete.