Typesense and open-source alternatives to Elasticsearch

There are many search engines out there, and they come in many shapes and sizes: open source, commercial, self-hosted, SaaS… Some are built from the ground up, while others hail from Lucene. We’re mostly interested in open-source options that we can use to enhance the user experience for our customers.

Elasticsearch vs OpenSearch

Historically we’ve stuck with Lucene-based engines. Our first commercial deployment involved Solr; then we moved on to recommending Elasticsearch. That worked well when customers did not require paid-for features like security and authentication (and most medium-to-large enterprises absolutely need those). The licensing model, however, was already a bit of a concern, and on top of that, in early 2021 Elastic moved its core offerings away from the Apache 2.0 licence. To remain vendor-neutral, we started considering alternatives, and Typesense popped up. We were also quite curious to see how OpenSearch fares.

Let’s run some benchmarks, shall we?

Knowing that OpenSearch is in essence a fork of Elasticsearch, we don’t expect a lot of differences, so we’ll use just one of them to make our point. With that in mind, we’ll use Docker to spin up three services at the same time:

version: '3'
services:
    typesense:
        image: typesense/typesense:0.21.0
        volumes:
            - $PWD/typesense_data:/data
        ports:
            - "8108:8108"
        command: "--data-dir /data --api-key=123123"
    opensearch:
        image: opensearchproject/opensearch
        ports:
            - "9200:9200"
        environment:
            - discovery.type=single-node
        volumes:
            - $PWD/opensearch_data:/usr/share/opensearch/data
    solr:
        image: bitnami/solr # the official image seems to have a filesystem permissions issue when run under WSL, so we'll stick with the best alternative
        ports:
            - "8983:8983"
        volumes:
            - $PWD/solr_data:/bitnami
        environment: # part of the reason we picked Bitnami is convenience - these precreate a core and a collection for us on first run
            - SOLR_CORE=my_core
            - SOLR_COLLECTION=comments

First, let’s look at RAM consumption straight after the containers have spun up:

user@host:/mnt/repos/typesense$ docker ps -q | xargs  docker stats --no-stream
CONTAINER ID   NAME                     CPU %     MEM USAGE / LIMIT     MEM %     NET I/O          BLOCK I/O   PIDS
d9fdcbd58b0f   typesense_opensearch_1   0.55%     1.454GiB / 50.16GiB   2.90%     906B / 0B        0B / 0B     82
b8a67a4baf6c   typesense_typesense_1    0.14%     182.2MiB / 50.16GiB   0.35%     726B / 0B        0B / 0B     165
c623b9bd9225   typesense_solr_1         0.55%     710.4MiB / 50.16GiB   1.38%     726B / 0B        0B / 0B     55

The results are unsurprising: both Solr and OpenSearch are written in Java and tend to treat memory with a bit more liberty. Typesense, on the other hand, is written in C++, and that seems to help. But we didn’t start all three up simply to marvel at RAM consumption, did we?

Indexing documents

Pretty much all search engines work in two phases: first we build an index of the documents we care about, then we let users run queries against it. Indexing performance may not be the biggest concern, but since we must do it anyway, we may as well benchmark it and check memory consumption once we’re done.

For the purposes of this demo, let us index comments found on Stack Overflow circa 2010. We’ll use a simple C# application and common client libraries to do this. This way we’ll also be able to make a mental note of the developer experience (even though it’s highly subjective and much harder to quantify).

Our setup

The source data we’re going to go through looks pretty simple:

CREATE TABLE [dbo].[Comments](
	[Id] [int] IDENTITY(1,1) NOT NULL,
	[CreationDate] [datetime] NOT NULL,
	[PostId] [int] NOT NULL,
	[Score] [int] NULL,
	[Text] [nvarchar](700) NOT NULL,
	[UserId] [int] NULL,
 CONSTRAINT [PK_Comments__Id] PRIMARY KEY CLUSTERED ([Id] ASC))

The core of our test bench is also pretty simple (we’ll use LINQPad to further reduce noise):

public abstract class SearchEngineClient
{
	public abstract Task IndexAsync(IEnumerable<Comments> obj);
	public virtual async Task CommitAsync() {/* do nothing by default */}
}

async Task Main()
{
	var clients = new List<SearchEngineClient> {
		new SolrClient(),
		new OpenSearchClient(),
		new TypeSenseClient(),
	};

	foreach (var client in clients)
	{
		var clientName = client.GetType().Name;
		Console.WriteLine($"Testing {clientName}");
		Stopwatch sw = new Stopwatch();
		sw.Start();
		var input = Comments.Take(1000000);
		var pageSize = 10000;
		int pages = input.Count() / pageSize;
		int rem = input.Count() % pageSize;
		Console.Write($"Processed page.."); 
		for (int i = 0; i < pages; i++)
		{
			await client.IndexAsync(input.Skip(i * pageSize).Take(pageSize));			
			Console.Write($"{i}/{pages}..");
		}
		if (rem > 0) await client.IndexAsync(input.Skip(pages * pageSize).Take(rem));
		Console.WriteLine();
		
		await client.CommitAsync();
		sw.Stop();
		Console.WriteLine($"Finished {clientName} - {sw.ElapsedMilliseconds} ms");
	}
}

class TypeSenseClient : SearchEngineClient
{
	private class CommentsDto
	{
		public string Id { get; set; }
		public string Text { get; set; }
	};

	private readonly ITypesenseClient typesenseClient;
	public TypeSenseClient()
	{
		var config = new Microsoft.Extensions.Options.OptionsWrapper<Config>(new Config() {
			ApiKey = "123123",
			Nodes = new List<Node> { new Node { Host = "localhost", Port = "8108", Protocol = "http" } }
		});
		var httpClient = new HttpClient() {Timeout = TimeSpan.FromSeconds(600)};

		typesenseClient = new TypesenseClient(config, httpClient);
		if (typesenseClient.RetrieveCollection("Comments").Result != null)
		{
			var _ = typesenseClient.DeleteCollection("Comments").Result;
		}

		var schema = new Schema
		{
			Name = "Comments",
			Fields = new List<Typesense.Field>
							{
								new Typesense.Field("id", Typesense.FieldType.String, false),
								new Typesense.Field("text", Typesense.FieldType.String, false),
							},
		};

		var createCollectionResult = typesenseClient.CreateCollection(schema).Result;
	}

	public override async Task IndexAsync(IEnumerable<Comments> obj) => await typesenseClient.ImportDocuments<Comments>("Comments", obj.ToList(), obj.Count());
}

class SolrClient : SearchEngineClient
{
	private ISolrOperations<Comments> _solr;
	public SolrClient()
	{
		Startup.Init<Comments>("http://localhost:8983/solr/my_core");
		_solr = ServiceLocator.Current.GetInstance<ISolrOperations<Comments>>();		
		_solr.Delete(new SolrQuery("*:*"));
		_solr.Commit();
	}
	
	public override async Task IndexAsync(IEnumerable<Comments> obj) => _solr.AddRange(obj);

	public override async Task CommitAsync() => _solr.Commit();
}

class OpenSearchClient : SearchEngineClient
{
	private ElasticClient _client;
	public OpenSearchClient()
	{
		var node = new Uri("http://localhost:9200");
		var settings = new ConnectionSettings(node);
		_client = new ElasticClient(settings);
		if(_client.Indices.Exists(Indices.Index("comments")).Exists) _client.Indices.Delete(Indices.Index("comments"));
	}
	
	public override async Task IndexAsync(IEnumerable<Comments> obj) => await _client.IndexManyAsync(obj, "comments");
}

Typesense Client

We picked up the Typesense NuGet package and had no issues connecting to our test container. One thing we immediately noticed was how the documentation seems to favour the use of ServiceCollection and promote DI practices in general. That is a good thing for bigger projects, but we ended up taking an easier path.

Field names appear to be case sensitive, so we had to remember that most JSON serialisers apply naming conventions and translate PascalCase to camelCase. Not a huge deal, but definitely something that tripped us up.
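
Had we kept a PascalCase DTO, pinning the serialised names explicitly would have been one way around it. Below is a minimal sketch using System.Text.Json attributes; whether the client library honours them depends on the serialiser it uses internally, so treat it as illustrative rather than the approach we shipped.

using System.Text.Json.Serialization;

// Property names pinned to the lowercase fields declared in the Typesense schema,
// so no naming policy can silently rename them on the way out.
public class CommentDocument
{
    [JsonPropertyName("id")]
    public string Id { get; set; }

    [JsonPropertyName("text")]
    public string Text { get; set; }
}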

On the positive side, the library comes with async methods all the way through. It gently forces you into writing async code, which should help performance.

Solr Client

Next up we had good ol’ Solr, and SolrNet is the obvious client choice here. It had been a while since we last ran Solr in Docker, and apparently filesystem permissions can be tricky to get right, so we ended up going with the alternative image from Bitnami.

Another gotcha with Solr: the client library would not commit documents by itself, so we had to do it manually afterwards. It also ships with a DI framework and pushes us towards the ServiceLocator pattern. A bit dated, but we’re used to it.

OpenSearch Client

Elastic has not made it easy for us. They ship two client libraries: Elasticsearch.Net and NEST. The former is low level with minimal overhead, while NEST is slightly more human-friendly. We ended up picking the latter.

Indexing results

After indexing one million comments with all three engines, we arrived at somewhat surprising results:

                     Solr        OpenSearch   Typesense
RAM on startup       710.4 MiB   1.453 GiB    182.3 MiB
Indexing time        56977 ms    51813 ms     81396 ms
RAM after indexing   808.2 MiB   1.768 GiB    257.8 MiB

OpenSearch seems to do slightly better than the competition on speed, while Typesense takes a bit longer. On the other hand, Typesense maintained a solid lead in the memory consumption department, and that might come in handy when we move on to the next phase of our testing.

Conclusion

While these results don’t stack up in Typesense’s favour, the difference is not stark. To draw any conclusions, we would need to run more queries and get a better feel for the developer experience. After all, the value proposition of any search engine is, well, search. Indexing speed is nice to have but still comes secondary to actual search performance.
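
To give a flavour of what that next round will look like, here is a minimal sketch of a single query against the collection we just built, going straight at Typesense’s HTTP API (the search term is arbitrary):

using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class TypesenseSearchDemo
{
    public static async Task Main()
    {
        using var http = new HttpClient { BaseAddress = new Uri("http://localhost:8108") };
        http.DefaultRequestHeaders.Add("X-TYPESENSE-API-KEY", "123123");

        // full-text search over the "text" field of the Comments collection we created earlier
        var json = await http.GetStringAsync(
            "/collections/Comments/documents/search?q=jquery&query_by=text&per_page=10");

        Console.WriteLine(json); // raw JSON with hits, highlights and timing info
    }
}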

Watch this space!

Azure Functions – OpenAPI + EF Core = 💥

Creating Swagger-enabled Azure Functions is not that hard to do. Visual Studio literally comes with a template for that:

Inspecting the newly created project, we see that it all comes down to one NuGet package. It magically hooks into IWebJobsStartup and registers additional routes for the Swagger UI and the OpenAPI document. At runtime, it reflects over suitable entry points in the assembly and builds the required responses on the fly. Elegant indeed.
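
For reference, the kind of entry point the extension reflects over looks roughly like this. A sketch only: the attributes come from the Microsoft.Azure.WebJobs.Extensions.OpenApi package the template references at the time of writing, while the function itself is made up.

using System.Net;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Azure.WebJobs.Extensions.OpenApi.Core.Attributes;

public static class HelloFunction
{
    [FunctionName("Hello")]
    [OpenApiOperation(operationId: "hello", tags: new[] { "demo" })]
    [OpenApiResponseWithBody(statusCode: HttpStatusCode.OK, contentType: "text/plain", bodyType: typeof(string))]
    public static IActionResult Run(
        [HttpTrigger(AuthorizationLevel.Function, "get", Route = "hello")] HttpRequest req)
    {
        // the extension picks this method up via reflection and describes it in the OpenAPI document
        return new OkObjectResult("Hello from a Swagger-enabled function");
    }
}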

Installing Entity Framework

Now, suppose we need to talk to Azure SQL, so we’d like to add EF Core to the mix. As much as we love to go for the latest and greatest, that is unfortunately a bit messy at the moment. Instead, let’s be a bit more conservative and stick with EF Core 3.1.
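
The wiring itself is nothing special: the usual FunctionsStartup registration. A sketch is below, with a made-up OrdersContext and connection string setting purely for illustration.

using System;
using Microsoft.Azure.Functions.Extensions.DependencyInjection;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.DependencyInjection;

[assembly: FunctionsStartup(typeof(MyFunctionApp.Startup))]

namespace MyFunctionApp
{
    public class Startup : FunctionsStartup
    {
        public override void Configure(IFunctionsHostBuilder builder)
        {
            // hypothetical DbContext wired to Azure SQL; the connection string lives in app settings
            builder.Services.AddDbContext<OrdersContext>(options =>
                options.UseSqlServer(Environment.GetEnvironmentVariable("SqlConnectionString")));
        }
    }

    public class OrdersContext : DbContext
    {
        public OrdersContext(DbContextOptions<OrdersContext> options) : base(options) { }
        public DbSet<Order> Orders { get; set; }
    }

    public class Order
    {
        public int Id { get; set; }
        public string Description { get; set; }
    }
}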

Running the function, however, greeted us with an error about a missing Microsoft.Extensions.Logging.Abstractions assembly. We did not expect that, did we?

The error message is pretty clear: the assembly somehow did not get copied to the output location. And indeed, the file was missing:

Apparently, when VS builds the function, it makes a second copy of the libraries it thinks are required, and in our case it decided not to pick up the dependency. Adding <_FunctionsSkipCleanOutput>true</_FunctionsSkipCleanOutput> to the project file fixes that:

Are we there yet?

Probably, but there’s a catch: our deployment package just got bigger. Alternatively, we could downgrade EF Core to 3.1.13, which happens to use the same version of Microsoft.Extensions.Logging.Abstractions. That way we’d avoid having to hack project files, at the expense of limiting ourselves to an older version of EF Core. Ultimately, we hope the OpenAPI extension picks up the slack and goes GA soon. For now, it looks like we’ll have to live with one of these workarounds.

Web API – Dev environment in 120 seconds

Quite a few recent engagements saw us developing APIs for clients. Setting these projects up is a lot of fun at first. After a few deployments, however, we felt there should be a way to optimise our workflow and bootstrap environments a bit quicker.

We wanted to craft a skeleton project that would provide structure and repeatability. After quick validation we decided that the stock Weather Forecast project is probably good enough as an API starting point. With that part out of the way, we also needed a client application that we could use while developing the API and when handing it over to the client.

Our constraints

Given the purpose of our template we also had a few more limitations:

  • Clean desk policy – the only required tools are Docker and VS Code (and lots of RAM! but that would be another day’s problem). Everything else should be transient and should leave no residue on the host system.
  • Offline friendly – demos and handovers can happen on-site, where we won’t necessarily have access to the corporate WiFi or a wired network
  • Open Source – not a constraint per se, but very nice to have

With that in mind, our first candidate, Postman, was out, and after scratching our heads for a little while we stumbled upon Hoppscotch. A “light-weight, web-based API development suite”, as it says on the tin, it seems to deliver most of the features we’d use.

Setting up with Docker

There are heaps of examples of how to use Hoppscotch, but we haven’t seen a lot regarding self-hosting. Probably because it’s fairly straightforward to get started:

docker run -it --rm -p 3000:3000 hoppscotch/hoppscotch

After that we should be able to just visit localhost:3000 and see the sleek UI:

hoppscotch UI

Building containers

Before we get too far ahead, let’s codify the bits we already know. We start VS Code and browse through the catalog of available dev containers… This time round we needed to set up and orchestrate at least two containers: our app as well as Hoppscotch, sitting in the same virtual network. That led us to opt for the docker-from-docker-compose container template. On top of that, we enhanced it with a dotnet SDK installation, just like our AWS Lambda container.

Finally, the docker-compose.yml needs Hoppscotch service definition at the bottom:

  hoppscotch:
    image: hoppscotch/hoppscotch
    ports:
      - 3000:3000

Should be smooth sailing from here: reopen in container, create a web API project and test away! Right?

A few quirks to keep in mind

As soon as we fired up the UI and tried making simple requests, we realised that Hoppscotch is not immune to CORS restrictions. The developers offer a few ways to fix this:

  1. enable CORS in the API itself – that’s what we ended up doing for now (see the sketch right after this list)
  2. set up a browser extension – we couldn’t go that route, as it would defeat our clean desk policy, and it is not yet available in the Microsoft Edge extension store anyway
  3. finally, we could use proxyscotch, but that looked like a rabbit hole we may want to explore later.
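
For completeness, here is roughly what option 1 looks like in an ASP.NET Core API. A development-only sketch, deliberately permissive so the Hoppscotch origin can reach the API; the policy name is ours, and it should be tightened (or removed) before anything ships.

using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Hosting;
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        // wide-open CORS policy, intended strictly for local development
        services.AddCors(options =>
            options.AddPolicy("AllowHoppscotch", policy =>
                policy.AllowAnyOrigin().AllowAnyHeader().AllowAnyMethod()));
        services.AddControllers();
    }

    public void Configure(IApplicationBuilder app, IWebHostEnvironment env)
    {
        app.UseRouting();
        app.UseCors("AllowHoppscotch"); // must sit between UseRouting and UseEndpoints
        app.UseEndpoints(endpoints => endpoints.MapControllers());
    }
}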

Authentication mechanism support is hopefully coming, so we’ll watch that space.

There’s one more interesting behaviour that caught us off guard: the client silently fails the SSL certificate check until we manually trust the host in another tab. There are other, more technical, solutions, but the easiest for now is to avoid SSL in development.

Hoppscotch ssl trust error
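
Avoiding SSL here simply means binding the API to a plain HTTP endpoint inside the container and skipping HTTPS redirection. A sketch, reusing the Startup class from the CORS example above; the port is arbitrary:

using Microsoft.AspNetCore.Hosting;
using Microsoft.Extensions.Hosting;

public static class Program
{
    public static void Main(string[] args) =>
        Host.CreateDefaultBuilder(args)
            .ConfigureWebHostDefaults(webBuilder =>
                webBuilder
                    .UseStartup<Startup>()
                    // plain HTTP only while developing inside the container,
                    // which sidesteps the certificate-trust dance above
                    .UseUrls("http://0.0.0.0:5000"))
            .Build()
            .Run();
}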

Conclusion

Once again, we used our weapon of choice and produced an artifact that enables us to develop and test containerised APIs faster!

end to end setup flow

AWS Lambda Dev – environment in 120 seconds

Okay, despite the roaring success we had with our previous attempt at this, setting up VS Code dev containers for AWS SAM proved to be quite a bit of a pain, and we’re still not sure it’s worth it. But it was interesting to set up and may be useful in some circumstances, so here we go.

Some issues we ran into

The biggest issue by far was the fact that SAM heavily relies on containers, which for us means going one level deeper and using the docker-in-docker dev container as a starting point. The base image there comes with the bare minimum of software, and the dotnet SDK is not part of it, so we’ll have to install everything ourselves:

#!/usr/bin/env bash

set -e

if [ "$(id -u)" -ne 0 ]; then
    echo -e 'Script must be run as root. Use sudo, su, or add "USER root" to your Dockerfile before running this script.'
    exit 1
fi

curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
rm -rf ./aws
rm ./awscliv2.zip
echo "AWS CLI version `aws --version`"

curl -L "https://github.com/aws/aws-sam-cli/releases/latest/download/aws-sam-cli-linux-x86_64.zip" -o "aws-sam-cli-linux-x86_64.zip"
unzip aws-sam-cli-linux-x86_64.zip -d sam-installation
sudo ./sam-installation/install
echo "SAM version `sam --version`"
rm -rf ./sam-installation
rm ./aws-sam-cli-linux-x86_64.zip

wget https://packages.microsoft.com/config/debian/11/packages-microsoft-prod.deb -O packages-microsoft-prod.deb
sudo dpkg -i packages-microsoft-prod.deb
rm packages-microsoft-prod.deb
sudo apt-get update; \
  sudo apt-get install -y apt-transport-https && \
  sudo apt-get update && \
  sudo apt-get install -y dotnet-sdk-3.1

# Installing lambda tools was required to get lambda to work while I was testing different approaches. It may have become redundant after so many iterations and changes to the script, but probably does not hurt
dotnet tool install -g Amazon.Lambda.Tools
export PATH="$PATH:$HOME/.dotnet/tools"

This is fairly straightforward: install the AWS CLI and SAM as described in their documentation, and then install the dotnet SDK. All we need to do now is call this script from the main Dockerfile.

It also helps to pre-populate the container with extensions we’re going to need anyway:

"extensions": [
	"ms-azuretools.vscode-docker",
	"amazonwebservices.aws-toolkit-vscode",
	"ms-dotnettools.csharp",
	"redhat.vscode-yaml",
	"zainchen.json" // this probably can be removed
],

Debugging experience

Apparently, debugging AWS Lambda is slightly different from Azure Functions, in the sense that a Lambda is not intended to be invoked from a browser; instead it accepts an event via the built-in dispatcher. We could potentially spend more time on it and get it to work with browsers, but this looked good enough for a first stab.

Building up the winning sequence

With all of the above in mind we ended up with roughly the following sequence to get debugging to work:

  1. started with the modified Docker-in-Docker template and added all the tools
  2. opened the container and used the AWS extension to generate a lambda skeleton app (after a couple of failed attempts we settled on the dotnetcore3.1 (image) template)
  3. let OmniSharp run, pick up all C# projects and restore packages
  4. rebuilt the container to reinitialise extensions and make sure we were starting afresh
  5. once we reopened the container, used the AWS extension again to generate a launch configuration (it is important to let SAM know which version of dotnet we’re going to need; check launch.json to verify)
  6. and finally, ran it

Action!

As always, the code is on GitHub.

Azure Static Web Apps – adding PR support to Azure DevOps pipeline

Last time we took a peek under the hood of Static Web Apps, we discovered a Docker container that allowed us to do custom deployments. This, however, left us with an issue: we could create staging environments, but we could not quite call it a day, as we could not clean up after ourselves.

There is more to custom deployments

Further inspection of the GitHub Actions config revealed there’s one more action we could potentially exploit to take full advantage of custom workflows. It is called “close”:

name: Azure Static Web Apps CI/CD
....
jobs:
  close_pull_request_job:
    ... bunch of conditions here
    action: "close" # that is our hint!

With the above in mind, we can make an educated guess at how to invoke it with Docker:

docker run -it --rm \
   -e INPUT_AZURE_STATIC_WEB_APPS_API_TOKEN=<your deployment token> \
   -e DEPLOYMENT_PROVIDER=DevOps \
   -e GITHUB_WORKSPACE="/working_dir" \
   -e IS_PULL_REQUEST=true \
   -e BRANCH="TEST_BRANCH" \
   -e ENVIRONMENT_NAME="TESTENV" \
   -e PULL_REQUEST_TITLE="PR-TITLE" \
   mcr.microsoft.com/appsvc/staticappsclient:stable \
   ./bin/staticsites/StaticSitesClient close --verbose

Running this indeed closes off an environment. That’s it!

Can we build an ADO pipeline though?

Just running Docker containers is not really that useful, as these actions are intended for CI/CD pipelines. Unfortunately, there’s no single config file we can edit to achieve this with Azure DevOps; we’ll have to take a somewhat more hands-on approach. Roughly, the solution looks like this:

First, we’ll create a branch policy to kick off a deployment to the staging environment. Then we’ll use a Service Hook to trigger an Azure Function on a successful PR merge. Finally, the stock-standard Static Web Apps task will run on the master branch when a new commit gets pushed.

Branch policy

Creating the branch policy itself is very straightforward: first we’ll need a separate pipeline definition:

pr:
  - master

pool:
  vmImage: ubuntu-latest

steps:
  - checkout: self    
  - bash: |
      docker run \
      --rm \
      -e INPUT_AZURE_STATIC_WEB_APPS_API_TOKEN=$(deployment_token)  \
      -e DEPLOYMENT_PROVIDER=DevOps \
      -e GITHUB_WORKSPACE="/working_dir" \
      -e IS_PULL_REQUEST=true \
      -e BRANCH=$(System.PullRequest.SourceBranch) \
      -e ENVIRONMENT_NAME="TESTENV" \
      -e PULL_REQUEST_TITLE="PR # $(System.PullRequest.PullRequestId)" \
      -e INPUT_APP_LOCATION="." \
      -e INPUT_API_LOCATION="./api" \
      -v ${PWD}:/working_dir \
      mcr.microsoft.com/appsvc/staticappsclient:stable \
      ./bin/staticsites/StaticSitesClient upload

Here we use a PR trigger, along with some variables to push through to Azure Static Web Apps. Apart from that, it’s the same simple docker run we have already had success with. To hook it up, we need a Build Validation check that triggers this pipeline:

Teardown pipeline definition

The second part is a bit more complicated and requires an Azure Function to pull off. Let’s start by defining the pipeline that our function will run:

trigger: none

pool:
  vmImage: ubuntu-latest

steps:
  - script: |
      docker run --rm \
      -e INPUT_AZURE_STATIC_WEB_APPS_API_TOKEN=$(deployment_token) \
      -e DEPLOYMENT_PROVIDER=DevOps \
      -e GITHUB_WORKSPACE="/working_dir" \
      -e IS_PULL_REQUEST=true \
      -e BRANCH=$(PullRequest_SourceBranch) \
      -e ENVIRONMENT_NAME="TESTENV" \
      -e PULL_REQUEST_TITLE="PR # $(PullRequest_PullRequestId)" \
      mcr.microsoft.com/appsvc/staticappsclient:stable \
      ./bin/staticsites/StaticSitesClient close --verbose
    displayName: 'Cleanup staging environment'

One thing to note here is the manual trigger: we opt out of CI/CD. We also make a note of the environment variables that our function will have to populate.

Azure Function

It really doesn’t matter what sort of function we create. In this case we opt for C# code that we can author straight from the Portal for simplicity. We also need to generate a PAT so our function can call ADO.

#r "Newtonsoft.Json"

using System.Net;
using System.Net.Http.Headers;
using System.Text;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.Primitives;
using Newtonsoft.Json;

private const string personalaccesstoken = "<your PAT>";
private const string organization = "<your org>";
private const string project = "<your project>";
private const int pipelineId = <your pipeline Id>; 

public static async Task<IActionResult> Run(HttpRequest req, ILogger log)
{
    log.LogInformation("C# HTTP trigger function processed a request.");
    string requestBody = await new StreamReader(req.Body).ReadToEndAsync();
    dynamic data = JsonConvert.DeserializeObject(requestBody);	

    log.LogInformation($"eventType: {data?.eventType}");
    log.LogInformation($"message text: {data?.message?.text}");
    log.LogInformation($"pullRequestId: {data?.resource?.pullRequestId}");
    log.LogInformation($"sourceRefName: {data?.resource?.sourceRefName}");

    try
	{
		using (HttpClient client = new HttpClient())
		{
			client.DefaultRequestHeaders.Accept.Add(new System.Net.Http.Headers.MediaTypeWithQualityHeaderValue("application/json"));
			client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Basic", ToBase64(personalaccesstoken));

			string payload = @"{ 
		""variables"": {
			""System.PullRequest.SourceBranch"": {
				""isSecret"": false,
            	""value"": """ + data?.resource?.sourceRefName + @"""
			},
			""System.PullRequest.PullRequestId"": {
				""isSecret"": false,
            	""value"": "+ data?.resource?.pullRequestId + @"
			}
		}
	}";
            var url = $"https://dev.azure.com/{organization}/{project}/_apis/pipelines/{pipelineId}/runs?api-version=6.0-preview.1";
            log.LogInformation($"sending payload: {payload}");
            log.LogInformation($"api url: {url}");
			using (HttpResponseMessage response = await client.PostAsync(url, new StringContent(payload, Encoding.UTF8, "application/json")))
			{
				response.EnsureSuccessStatusCode();
				string responseBody = await response.Content.ReadAsStringAsync();
                return new OkObjectResult(responseBody);
			}
		}
	}
	catch (Exception ex)
	{
        log.LogError(ex, "Error running pipeline");
        return new JsonResult(ex) { StatusCode = 500 }; 
	}
}

private static string ToBase64(string input)
{
	return Convert.ToBase64String(System.Text.ASCIIEncoding.ASCII.GetBytes(string.Format("{0}:{1}", "", input)));
}

Service Hook

With all prep work done, all we have left to do is to connect PR merge event to Function call:

The function URL should contain the access key, if one was defined. The easiest way is probably to copy it straight from the Portal’s Code + Test blade:

It may also be a good idea to test the connection on the second form before finishing up.

Conclusion

Once everything is connected, the pipelines should create and delete staging environments similar to what GitHub does. One possible improvement would be to replace the branch policy with yet another Service Hook to a Function, so that the PR title gets correctly reflected in the Portal.

But I’ll leave it as a challenge for readers to complete.

Things they don’t tell you – Tagging containers in Azure DevOps

Here’s an interesting gotcha that kept us occupied for a little while. Our client wanted us to build an Azure DevOps pipeline that would build a container, tag it, and launch the image to do more work. As the result was not really worth pushing up to an image registry, we decided to go fully local.

Setting up agent pool

Our client had a further constraint that prevented them from using managed agents, so the first thing we had to do was define a local pool. The process was uneventful, so we thought we were off to a good start.

Creating Azure DevOps pipeline

Our first stab yielded a pipeline definition along the following lines:

trigger:
  - none

jobs:
  - job: test
    pool:
      name: local-linux-pool
    displayName: Build Cool software
    steps:
      - task: Bash@3
        displayName: Prune leftover containers
        inputs:
          targetType: inline
          script: |
            docker system prune -f -a

      - task: Docker@2
        displayName: Build worker container
        inputs:          
          command: build
          Dockerfile: 'container/Dockerfile'
          tags: |
            supercool/test-app # tagging the container would simplify our next step and save us the headache of trying to figure out the correct image ID

      - bash: |
          docker run --name builderContainer -it --rm supercool/test-app:latest # we assume latest is the correct tag here

Nothing fancy here. We clean the environment before each run (this is optional but helped with troubleshooting). Then we build a container from a Dockerfile found in source control. To make sure we run the right thing in the next step, we want to tag the image.

But then it went sideways…

Unable to find image 'supercool/test-app:latest' locally
docker: Error response from daemon: pull access denied for test-app, repository does not exist or may require 'docker login': denied: requested access to the resource is denied.

This, of course, means Docker could not find a local image and went off to pull it from the default registry. And we don’t want that!

Upon further inspection we found that the command line that builds the container DOES NOT tag it!

/usr/bin/docker build \
		-f /myagent/_work/1/s/container/Dockerfile \
		--label com.azure.dev.image.system.teamfoundationcollectionuri=https://dev.azure.com/thisorgdoesnotexist/ \
		--label com.azure.dev.image.system.teamproject=test \
		--label com.azure.dev.image.build.repository.name=test \
		--label com.azure.dev.image.build.sourceversion=3d7a6ed84b0e9538b1b29206fd1651b44c0f74d8 \
		--label com.azure.dev.image.build.repository.uri=https://thisorgdoesnotexist@dev.azure.com/thisorgdoesnotexist/test/_git/test \
		--label com.azure.dev.image.build.sourcebranchname=main \
		--label com.azure.dev.image.build.definitionname=test \
		--label com.azure.dev.image.build.buildnumber=20262388.5 \
		--label com.azure.dev.image.build.builduri=vstfs:///Build/Build/11 \
		--label image.base.ref.name=alpine:3.12 \
		--label image.base.digest=sha256:a296b4c6f6ee2b88f095b61e95c7dde451ba25598835b4978c9256d8c8ace48a \
		/myagent/_work/1/s/container

How does this happen?

Luckily, the Docker@2 task is open source and freely available on GitHub. Looking at the code, we notice an interesting message: "NotAddingAnyTagsToBuild": "Not adding any tags to the built image as no repository is specified." Is that our clue? Seems like it may be. Let’s keep digging.

Further inspection reveals that for tags to be applied, the task must be able to infer an image name:

 if (imageNames && imageNames.length > 0) {
        ....
        tagArguments.push(imageName);
        ....
    }
    else {
        tl.debug(tl.loc('NotAddingAnyTagsToBuild'));
    }

And that information must come from a repository input parameter.

Repository – (Optional) Name of repository within the container registry corresponding to the Docker registry service connection specified as input for containerRegistry

Looking at the documentation alone, this makes no sense.

A further peek into the source code, however, reveals that the developers have kindly thought about local tagging:

public getQualifiedImageNamesFromConfig(repository: string, enforceDockerNamingConvention?: boolean) {
        let imageNames: string[] = [];
        if (repository) {
            let regUrls = this.getRegistryUrlsFromDockerConfig();
            if (regUrls && regUrls.length > 0) {

                // not our case, skipping for brevity
            }
            else {
                // in case there is no login information found and a repository is specified, the intention
                // might be to tag the image to refer locally.
                let imageName = repository;
                if (enforceDockerNamingConvention) {
                    imageName = imageUtils.generateValidImageName(imageName);
                }
                
                imageNames.push(imageName);
            }
        }

        return imageNames;
    }

Now we can solve it

Adding the repository input to our task without specifying containerRegistry should get us the desired result:

...
      - task: Docker@2
        displayName: Build worker container
        inputs:          
          command: build
          Dockerfile: 'container/Dockerfile'
          repository: supercool/test-app # moving this from tag input to repository does magic!
...

Looking at the logs we seem to have won:

/usr/bin/docker build \
		-f /myagent/_work/1/s/container/Dockerfile \
		#... ADO labels go here, irrelevant
		-t supercool/test-app \
		/myagent/_work/1/s/container

Conclusion

This scenario, however far-fetched it may appear, seems to be fully supported, at least at the code level. The documentation is lacking a little, but I understand how this nuance may be hard to convey in one or two paragraphs when there’s so much else to cover.

Azure Static Web Apps – Lazy Dev Environment

Playing with Static Web Apps is lots of fun. However, installing the list of required libraries and tools can get a little daunting, and removing them all later will likely leave a messy residue.

Use VS Code Dev containers then

So, let us assume WSL and Docker are already installed (Microsoft should consider shipping these features pre-installed, really). Then we can quickly grab VS Code and spin up a development container.

Turns out Microsoft has already provided a very good starting point, so all we need to do is:

  1. start a blank workspace folder, hit F1
  2. type “Add Development Container” and select the menu item
  3. type something and click “Show All Definitions”
  4. select “Azure Static Web Apps”
  5. press F1 once more and run “Remote-Containers: Reopen Folder in Container”

At the very minimum

To be valid, a Static Web App requires an index.html file. Let’s assume we’ve got the static frontend sorted. Now we also want to add an API:

vscode ➜ /workspaces/vs-dev-containers-demo $ mkdir api && cd api
vscode ➜ /workspaces/vs-dev-containers-demo/api $ func init
vscode ➜ /workspaces/vs-dev-containers-demo/api $ func new -l C# -t HttpTrigger -n HelloWorld

Nothing fancy, but now we can start everything with swa start:

vscode ➜ /workspaces/vs-dev-containers-demo $ swa start --api api

VS Code will go ahead and download the recommended extensions and language packs, so this should just work.

We want better dev experience

And this is where custom tasks and launch configurations come in handy. We want VS Code to run the SWA emulator for us and attach the debugger to the running Functions instance:

{
  "version": "0.2.0",
  "compounds": [
    {
      "name": "Run Static Web App with API",
      "configurations": ["Attach to .NET Functions", "Run SWA emulator"],        
      "presentation": {
        "hidden": false,
        "group": "",
        "order": 1
      }
    }
  ],
  "configurations": [
    {
      "name": "Attach to .NET Functions",
      "type": "coreclr",
      "request": "attach",
      "processId": "${command:azureFunctions.pickProcess}",
      "presentation": {
        "hidden": true,
        "group": "",
        "order": 2
      }
    },
    {
      "name": "Run SWA emulator",
      "type": "node-terminal",
      "request": "launch",      
      "cwd": "${workspaceFolder}",
      "command": "swa start . --api http://localhost:7071",
      "serverReadyAction": {
        "pattern": "Azure Static Web Apps emulator started at http://localhost:([0-9]+)",
        "uriFormat": "http://localhost:%s",
        "action": "openExternally"
      },
      "presentation": {
        "hidden": true,
        "group": "",
        "order": 3
      }
    }
  ]  
}

Save this file under .vscode/launch.json, reopen the project folder and hit F5 to enjoy the effect!

Finally, all code should get committed to GitHub (refrain from using ADO if you can).

EF Core 6 – does our IMethodCallTranslator still work?

A year and a half ago we posted an article on how we were able to plug into the EF Core pipeline and inject our own IMethodCallTranslator. That let us leverage SQL-native encryption functionality with EF Core 3.1 and was ultimately a win. A lot has changed in the ecosystem since: we’ve got .NET 5, and .NET 6 is coming up soon. So, we could not help but wonder…

Will it work with EF Core 6?

Apparently, EF Core 6 is mostly an evolutionary step over EF Core 5. That said, we skipped the previous version entirely, so it was unclear to what extent the EF team had reworked their internal APIs. Most of the extensibility points we used were internal and clearly marked as “not for public consumption”. With that in mind, our concerns seemed valid.

Turns out, the code needs change…

The first issue we needed to rectify was implementing ShouldUseSameServiceProvider: from what I can tell, it helps EF cache internal service providers more efficiently, and in our case returning a sensible default seemed good enough.
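
For context, the member sits on the DbContextOptionsExtensionInfo that backs our options extension. Below is a trimmed-down sketch of the shape we ended up with; the member names are EF Core 6’s, while the class name and return values are ours, and the real thing lives in the GitHub repo.

using System.Collections.Generic;
using Microsoft.EntityFrameworkCore.Infrastructure;

internal sealed class CustomTranslationExtensionInfo : DbContextOptionsExtensionInfo
{
    public CustomTranslationExtensionInfo(IDbContextOptionsExtension extension) : base(extension) { }

    public override bool IsDatabaseProvider => false;

    public override string LogFragment => "using custom SQL translation ";

    // EF Core 6 changed the return type here from long to int
    public override int GetServiceProviderHashCode() => 0;

    // the new abstract member mentioned above; we simply treat any provider
    // that carries our extension as equivalent
    public override bool ShouldUseSameServiceProvider(DbContextOptionsExtensionInfo other)
        => other is CustomTranslationExtensionInfo;

    public override void PopulateDebugInfo(IDictionary<string, string> debugInfo) { }
}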

But that’s where things really went sideways:

Apparently, adding our custom IDbContextOptionsExtension resets the cache, and by the time EF arrives at model initialisation, the DI container instance has been wiped, leaving us with a bunch of null references (including the one above).

One line fix

I am still unsure why EF gets so upset when we add a new extension. Stepping through the code would likely provide the answer, but it doesn’t feel worth the effort. Playing around with service scopes, however, I noticed that many built-in services get registered through a different extension method with a Scoped lifecycle. This prompted me to try changing my registration method signature, and voilà:

And as usual, fully functioning code sits on GitHub.

Azure Static Web Apps – custom build and deployments

Despite Microsoft’s claim of “First-class GitHub and Azure DevOps integration” with Static Web Apps, one is significantly easier to use than the other. Let’s take a quick look at how many features we’re giving up by sticking with Azure DevOps:

                                          GitHub                                                ADO
Build/Deploy pipelines                    Automatically adds pipeline definition to the repo    Requires manual pipeline setup
Azure Portal support                      ✓                                                     ✗
VS Code Extension                         ✓                                                     ✗
Staging environments and Pull Requests    ✓                                                     ✗

Looks like a lot of functionality is missing. This, however, raises the question: can we do something about it?

Turns out we can…sort of

Looking a bit further into the ADO build pipeline, we notice that Microsoft has published this task on GitHub. Bingo!

The process seems to run a single script that in turn runs a Docker image, something like this:

...
docker run \
    -e INPUT_AZURE_STATIC_WEB_APPS_API_TOKEN="$SWA_API_TOKEN" \
    ...
    -v "$mount_dir:$workspace" \
    mcr.microsoft.com/appsvc/staticappsclient:stable \
    ./bin/staticsites/StaticSitesClient upload

What exactly StaticSitesClient does is shrouded in mystery, but upon a successful build (using Oryx) it creates two zip files: app.zip and api.zip. It then uploads both to Blob storage and submits a request to the ContentDistribution endpoint to pick the assets up.

It’s Docker – it runs anywhere

This image does not have to run on ADO or GitHub! We can indeed run this container locally and deploy without even committing the source code. All we need is a deployment token:

docker run -it --rm \
   -e INPUT_AZURE_STATIC_WEB_APPS_API_TOKEN=<your_deployment_token> \
   -e DEPLOYMENT_PROVIDER=DevOps \
   -e GITHUB_WORKSPACE="/working_dir" \
   -e IS_PULL_REQUEST=true \
   -e BRANCH="TEST_BRANCH" \
   -e ENVIRONMENT_NAME="TESTENV" \
   -e PULL_REQUEST_TITLE="PR-TITLE" \
   -e INPUT_APP_LOCATION="." \
   -e INPUT_API_LOCATION="./api" \
   -v ${pwd}:/working_dir \
   mcr.microsoft.com/appsvc/staticappsclient:stable \
   ./bin/staticsites/StaticSitesClient upload

Also notice how this deployment created a staging environment:

Word of caution

Even though it seems like a pretty good little hack, this is not supported. The Portal will also bug out and refuse to display Environments correctly if the resource was created with the “Other” workflow:

Portal
AZ CLI

Conclusion

Diving deep into Static Web Apps deployment is lots of fun. It may also help in situations where external source control is not available. For real production workloads, however, we’d recommend sticking with GitHub flow.

Setting up Basic Auth for Swashbuckle

Let us put the keywords up front: this is a Swashbuckle Basic authentication setup that works for Swashbuckle.WebApi installed on Web API projects (.NET 4.x). Needing Basic auth is much less common nowadays, when proper mechanisms are readily available in all shapes and sizes; however, it may still come in useful for internal tools or existing integrations.

Why?

Most tutorials online cover the next-gen library aimed specifically at ASP.NET Core, which is great (we’re all in for tech advancement), but sometimes legacy needs a prop-up. If you are looking to add Basic auth to an ASP.NET Core project, look up AddSecurityDefinition and go from there. If legacy is indeed the one in trouble, read on.
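
(For the ASP.NET Core crowd, here is a minimal sketch of that route using Swashbuckle.AspNetCore; the document name and title are placeholders.)

using Microsoft.Extensions.DependencyInjection;
using Microsoft.OpenApi.Models;

public static class SwaggerBasicAuthSetup
{
    public static IServiceCollection AddSwaggerWithBasicAuth(this IServiceCollection services) =>
        services.AddSwaggerGen(c =>
        {
            c.SwaggerDoc("v1", new OpenApiInfo { Title = "My API", Version = "v1" });

            // declares the scheme; attach it globally via AddSecurityRequirement or per operation
            c.AddSecurityDefinition("basic", new OpenApiSecurityScheme
            {
                Type = SecuritySchemeType.Http,
                Scheme = "basic",
                Description = "Basic HTTP Authentication"
            });
        });
}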

Setting it up

The prescribed solution takes two steps:
1. add the authentication scheme on the SwaggerDocsConfig class
2. attach authentication to endpoints as required with an IDocumentFilter or IOperationFilter

Assuming the installer left us with the App_Start/SwaggerConfig.cs file, we’d arrive at:

public class SwaggerConfig
{
	public static void Register()
	{
		var thisAssembly = typeof(SwaggerConfig).Assembly;

		GlobalConfiguration.Configuration
			.EnableSwagger(c =>
				{
					...
					c.BasicAuth("basic").Description("Basic HTTP Authentication");
					c.OperationFilter<BasicAuthOpFilter>();
					...
				});
	}
}

and the most basic IOperationFilter would look like so:

public class BasicAuthOpFilter : IOperationFilter
{
	public void Apply(Operation operation, SchemaRegistry schemaRegistry, ApiDescription apiDescription)
	{
		operation.security = operation.security ?? new List<IDictionary<string, IEnumerable<string>>>();
		operation.security.Add(new Dictionary<string, IEnumerable<string>>()
							   {
								   { "basic", new List<string>() }
							   });
	}
}