Parsing Bicep string interpolation

This post discusses Workout, my new OSS project. If you’re interested, you can find it on GitHub - https://github.com/TheCloudTheory/Workout

String interpolation in Azure Bicep lets you format strings easily, so you don’t need to chain all those nasty functions and concatenations. In its simplest form, it looks like this:

param parAcrName string
var varAcrName = 'workout${parAcrName}'

If you’re familiar with ARM Templates, you probably remember that we needed the following syntax to achieve the same result:

{
    "foo": "[format('workout{0}', parameters('parAcrName'))]"
}

It’s ugly and too verbose. From the user’s perspective, string interpolation is a great feature that makes Bicep a much better choice for modern infrastructure automation. If you need to parse a Bicep file, though, things are no longer that easy.

Parsing string interpolation in Workout

Workout is an experimental DSL for writing Bicep tests. It lets you use syntax very similar to Bicep and is tightly coupled with its semantic model. As Workout is meant to test your infrastructure configuration, it needs to evaluate all the dynamic expressions in your Bicep files. Without that, it would be unable to correctly evaluate the assertions. String interpolation is one such dynamic expression. To find a proper implementation, let’s create a Bicep file.

When working with string interpolation, you need to make the following assumptions:

  • string interpolation contains from 1 to N parameters
  • string interpolation may contain other dynamic expressions (like variables or parameters)

To reflect that, we could introduce the following Bicep file containing a single ACR definition:

param parAcrName string
param parIsPrivate bool = false
param parSuffix string

@maxLength(3)
param parEnvironment string = 'dev'

var varAcrName = 'workout${parAcrName}${parEnvironment}${parSuffix}${parIsPrivate ? 'private' : 'public'}'
var varAcrLocation = 'westeurope'
var varAdminUserEnabled = parIsPrivate ? false : true

resource acr 'Microsoft.ContainerRegistry/registries@2023-11-01-preview' = {
  name: varAcrName
  location: varAcrLocation
  sku: {
    name: 'Basic'
  }
  properties: {
    adminUserEnabled: varAdminUserEnabled
  }
}

Our main area of interest will be the following line:

var varAcrName = 'workout${parAcrName}${parEnvironment}${parSuffix}${parIsPrivate ? 'private' : 'public'}'

Now we need to write a Workout file.

Writing test cases with Workout

In our Workout file we’ll introduce three separate test cases:

import './interpolation-multiple-parameters.bicep'

@smoke
test testCase1 = {
    param(parAcrName, 'acr')
    param(parIsPrivate, true)
    param(parSuffix, 'xyz')

    equals(acr.name, 'workoutacrdevxyzprivate')
}

@smoke
test testCase2 = {
    param(parAcrName, 'acr')
    param(parIsPrivate, false)
    param(parSuffix, 'xyz')

    equals(acr.name, 'workoutacrdevxyzpublic')
}

@smoke
test testCase3 = {
    param(parAcrName, 'acr')
    param(parIsPrivate, false)
    param(parSuffix, 'xyz')
    param(parEnvironment, 'tst')

    equals(acr.name, 'workoutacrtstxyzpublic')
}

For simplicity, we’ll validate only the name of the ACR generated by the Bicep file. We have three different names to check:

  • workoutacrdevxyzprivate
  • workoutacrdevxyzpublic
  • workoutacrtstxyzpublic

Let’s see now how Workout handles such cases.

Writing tests

Workout currently has quite a primitive test suite, which runs all the test cases E2E. For our purpose, we could add one more test looking like this:

public class StringInterpolationTests
{
    [Test]
    public void StringInterpolation_WhenStringInterpolationConsistsOfMultipleParameteres_TheyMustBeParsedCorrectly()
    {
        var result = Program.Main(["start", "workout", "--working-directory", "../../../individual-workouts", "--file", "multiple-interpolations.workout", "--debug"]);

        Assert.That(result, Is.EqualTo(0));
    }
}

The test runs the Workout CLI and checks whether the exit code indicates success (0).

Initial run

As expected, the initial run isn’t successful - it ends with the following error:

Index (zero based) must be greater than or equal to zero and less than the size of the argument list.

The error is something I’d expect - until now, Workout supported only a single parameter in a string interpolation. However, let’s dive deeper into the problem. Under the hood, our string interpolation is translated into the following expression:

format('workout{0}{1}{2}{3}', parameters('parAcrName'), parameters('parEnvironment'), parameters('parSuffix'), if(parameters('parIsPrivate'), 'private', 'public'))

This is exactly the same format() function you’d use in ARM Templates. As Bicep compiles to ARM Templates, it shouldn’t be a surprise that things look like that. The challenge now is to parse the format() function so Workout can correctly evaluate all the assertions.
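To make that translation more tangible, here is a minimal sketch of how an interpolated string can be split into a format template and its holes. It is not how the Bicep compiler or Workout does it: SplitInterpolation is a hypothetical helper, and it assumes that holes never contain a closing brace and that the text has no literal braces.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: turns the body of an interpolated string into
// a format template plus the list of expressions found in ${...} holes.
// Assumes holes never contain '}' and the text has no literal braces.
(string Template, List<string> Holes) SplitInterpolation(string body)
{
    var template = new StringBuilder();
    var holes = new List<string>();
    var i = 0;

    while (i < body.Length)
    {
        var isHoleStart = body[i] == '$' && i + 1 < body.Length && body[i + 1] == '{';
        if (!isHoleStart)
        {
            template.Append(body[i]);
            i++;
            continue;
        }

        var end = body.IndexOf('}', i + 2);
        if (end < 0)
        {
            throw new FormatException("Unterminated ${...} hole.");
        }

        // Replace the hole with a positional placeholder and remember its expression.
        template.Append('{').Append(holes.Count).Append('}');
        holes.Add(body.Substring(i + 2, end - i - 2));
        i = end + 1;
    }

    return (template.ToString(), holes);
}

var split = SplitInterpolation("workout${parAcrName}${parEnvironment}${parSuffix}${parIsPrivate ? 'private' : 'public'}");
Console.WriteLine(split.Template); // workout{0}{1}{2}{3}
Console.WriteLine(string.Join(" | ", split.Holes));
```

Each hole becomes one argument of format(), which is why the number of placeholders and the number of arguments always have to line up.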

Parsing format() function

Workout parses all the dynamic expressions using regular expressions. They make it possible to quickly find matches and extract them from each parsed string. For the format() function, we have the following expression:

[GeneratedRegex(@"format\('.+', ?'.+'\)", RegexOptions.Compiled)]
private static partial Regex FormatRegex();

This works perfectly fine for various strings like:

format('foo{0}', 'bar')
format('workout{0}{1}{2}{3}', parameters('parAcrName'), parameters('parEnvironment'), parameters('parSuffix'), if(parameters('parIsPrivate'), 'private', 'public'))
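If you’d like to check the pattern yourself, the sketch below runs it against a few inputs. It builds the regex with new Regex(...) instead of [GeneratedRegex], only because the source generator needs a partial class. The third sample is my own addition and not taken from Workout.

```csharp
using System;
using System.Text.RegularExpressions;

// Same pattern as above, created at runtime to keep the sketch self-contained.
var formatRegex = new Regex(@"format\('.+', ?'.+'\)");

var samples = new[]
{
    "format('foo{0}', 'bar')",
    "format('workout{0}{1}{2}{3}', parameters('parAcrName'), parameters('parEnvironment'), parameters('parSuffix'), if(parameters('parIsPrivate'), 'private', 'public'))",
    // Assumed extra sample: no two quoted literals next to each other.
    "format('foo{0}', parameters('bar'))",
};

foreach (var sample in samples)
{
    Console.WriteLine($"{formatRegex.IsMatch(sample)}: {sample}");
}
```

The first two samples match. The third one doesn’t, because the pattern expects two quoted literals separated by a comma somewhere in the call. That is worth keeping in mind when relying on it for other templates.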

The error we’re facing is therefore not related to the regular expression, but rather to the way Workout tries to format a string with placeholders. Until now, the implementation of that particular mechanism looked like this:

var replacedValue = value.Replace(match.Value, string.Format(formatValue, formatArgs));

The problem is caused by the string.Format() call, which receives the following arguments:

var replacedValue = value.Replace(match.Value, string.Format("workout{0}{1}{2}{3}", "parameters(parAcrName)"));

The obvious issue here is the number of arguments passed into string.Format() - instead of four, we’re passing only one. We could quite easily make a fix where we pass as many arguments as we got from parsing the raw value of the format() function in our template, but there’s one more issue to fix. Dynamic expressions like format() may include other dynamic expressions (for instance if()) besides the common parameters() or variables(). Those nested expressions may contain other nested expressions. We need to find a way to evaluate them in order.
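The argument-count problem is easy to reproduce in isolation. The snippet below is a standalone sketch and not Workout code. The values come from testCase1.

```csharp
using System;

var template = "workout{0}{1}{2}{3}";

// One argument for four placeholders: string.Format throws a FormatException.
try
{
    Console.WriteLine(string.Format(template, new object[] { "parameters(parAcrName)" }));
}
catch (FormatException ex)
{
    Console.WriteLine($"Failed: {ex.Message}");
}

// One argument per placeholder works as expected.
var formatted = string.Format(template, new object[] { "acr", "dev", "xyz", "private" });
Console.WriteLine(formatted); // workoutacrdevxyzprivate
```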

Evaluation order for expressions in Workout

When evaluating expressions, Workout assumes the following order:

  • variables & parameters
  • other dynamic expressions

While the order is simple and easy to follow, expressions may contain other expressions. We already saw that in our example - there’s an if() expression nested inside the format() function:

format('workout{0}{1}{2}{3}', parameters('parAcrName'), parameters('parEnvironment'), parameters('parSuffix'), if(parameters('parIsPrivate'), 'private', 'public'))

Such a situation complicates things, as we need not only to preserve the initial order of execution, but also to make sure that nested expressions are evaluated first. If we fail to do so (let’s say we evaluate format() first), we’ll end up with the following value:

workoutacrtstxyzif(False)

This is obviously wrong, so we need to make expressions aware of their hierarchy.

Evaluating nested expressions

To evaluate nested expressions, each root expression needs to evaluate all its inner expressions first, then evaluate itself. Let’s come back to our example. We have four different expressions to evaluate:

  • parameters('parAcrName')
  • parameters('parEnvironment')
  • parameters('parSuffix')
  • if(parameters('parIsPrivate'), 'private', 'public')

As you already know, the order of execution implies that Workout first evaluates parameters and then proceeds to evaluate other expressions. After the first iteration, our example will change to the following value:

[[format('workout{0}{1}{2}{3}', 'acr', dev, 'xyz', if(True, 'private', 'public'))]]
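The parameter pass can be sketched with a single Regex.Replace call. This is only an illustration under my own assumptions: ResolveParameters and the dictionary of values are hypothetical, and the sketch keeps quotes around string values, so its output may differ slightly from what Workout prints.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical parameter values mirroring testCase1 (parEnvironment uses its default).
var testParameters = new Dictionary<string, string>
{
    ["parAcrName"] = "'acr'",
    ["parEnvironment"] = "'dev'",
    ["parSuffix"] = "'xyz'",
    ["parIsPrivate"] = "True",
};

// Replaces every parameters('name') call with the value registered for that name.
// An unknown name ends with a KeyNotFoundException.
string ResolveParameters(string expression, IReadOnlyDictionary<string, string> parameterValues) =>
    Regex.Replace(expression, @"parameters\('(\w+)'\)", m => parameterValues[m.Groups[1].Value]);

var raw = "format('workout{0}{1}{2}{3}', parameters('parAcrName'), parameters('parEnvironment'), parameters('parSuffix'), if(parameters('parIsPrivate'), 'private', 'public'))";
var withParameters = ResolveParameters(raw, testParameters);

Console.WriteLine(withParameters);
// format('workout{0}{1}{2}{3}', 'acr', 'dev', 'xyz', if(True, 'private', 'public'))
```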

We have all the values of the parameters - now we need to evaluate the if() expression, which brings the value closer to the final result:

format('workout{0}{1}{2}{3}', 'acr', dev, 'xyz', 'private')
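A pass for if() could look roughly like the sketch below. EvaluateIf is a hypothetical helper, not Workout’s implementation. It assumes that the condition has already been reduced to True or False and that both branches are plain quoted literals.

```csharp
using System;
using System.Text.RegularExpressions;

// Replaces if(True|False, 'a', 'b') with the selected branch, kept in quotes.
// Assumes the condition is already a literal and the branches contain no nested calls.
string EvaluateIf(string expression) =>
    Regex.Replace(
        expression,
        @"\bif\((True|False), ?'([^']*)', ?'([^']*)'\)",
        m => "'" + (m.Groups[1].Value == "True" ? m.Groups[2].Value : m.Groups[3].Value) + "'");

var beforeIf = "format('workout{0}{1}{2}{3}', 'acr', 'dev', 'xyz', if(True, 'private', 'public'))";
var afterIf = EvaluateIf(beforeIf);

Console.WriteLine(afterIf);
// format('workout{0}{1}{2}{3}', 'acr', 'dev', 'xyz', 'private')
```

Because the pattern only accepts True or False as the condition, running this pass before parameters are resolved leaves the if() call untouched rather than producing a wrong value.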

Then, we can simply use the native C# string.Format() method to format the rest of the string:

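// Strip the "format(" prefix and the trailing ")" and split the raw arguments.
// Note: a plain Split(",") assumes that no argument contains a comma.
var args = rawMatch.Replace("format(", string.Empty).TrimEnd(')').Split(",");
// The first argument is the template; the remaining ones fill its placeholders.
var formatValue = args[0].Replace("'", string.Empty).Trim();
var formatArgs = args.Skip(1).Select(arg => arg.Replace("'", string.Empty).Trim()).ToArray();
var replacedValue = value.Replace(match.Value, string.Format(formatValue, formatArgs));

this.logger.LogDebug($"Replaced format function {match.Value} with value {replacedValue}.");
value = replacedValue;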

That’s it! After running the test, it seems that all test cases pass:

...(lots of other debug messages)...

[2024-08-13 20:28:44][Debug] Parsed 3 tests from ../../../individual-workouts/multiple-interpolations.workout.
[2024-08-13 20:28:44][Information] Found 3 tests.
[2024-08-13 20:28:44][Information] Running test: testCase1.
[2024-08-13 20:28:44][Debug] Running assertion: equals(acr.name, 'workoutacrdevxyzprivate').
[2024-08-13 20:28:44][Debug] Comparing [workoutacrdevxyzprivate, workoutacrdevxyzprivate]; assertion evaluated to True.
[2024-08-13 20:28:44][Debug] Running assertion: equals(acr.name, 'workoutacrdevxyzprivate') | Result: True.
[2024-08-13 20:28:44][Information] Test testCase1 passed.
[2024-08-13 20:28:44][Information] Running test: testCase2.
[2024-08-13 20:28:44][Debug] Running assertion: equals(acr.name, 'workoutacrdevxyzpublic').
[2024-08-13 20:28:44][Debug] Comparing [workoutacrdevxyzpublic, workoutacrdevxyzpublic]; assertion evaluated to True.
[2024-08-13 20:28:44][Debug] Running assertion: equals(acr.name, 'workoutacrdevxyzpublic') | Result: True.
[2024-08-13 20:28:44][Information] Test testCase2 passed.
[2024-08-13 20:28:44][Information] Running test: testCase3.
[2024-08-13 20:28:44][Debug] Running assertion: equals(acr.name, 'workoutacrtstxyzpublic').
[2024-08-13 20:28:44][Debug] Comparing [workoutacrtstxyzpublic, workoutacrtstxyzpublic]; assertion evaluated to True.
[2024-08-13 20:28:44][Debug] Running assertion: equals(acr.name, 'workoutacrtstxyzpublic') | Result: True.
[2024-08-13 20:28:44][Information] Test testCase3 passed.
[2024-08-13 20:28:44][Information] All tests passed.

Next steps

While the current method of parsing and evaluating expressions works for most cases, it is quite cumbersome and error-prone. A much better option would be to build a tree of expressions first and then evaluate them in the correct order. This can be done using, for example, regular expressions, at least until Workout has a proper semantic model that can parse the syntax.
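To give an idea of what a tree-based approach might look like, here is a rough sketch that swaps regular expressions for a tiny recursive-descent parser. None of it exists in Workout: ParseExpression, Parse and Evaluate are hypothetical names, the tree is encoded as nested arrays to keep the example short, and there is no handling of escaped quotes or malformed input.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// A node is either a string (literal or bare token) or an object[] whose
// first element is the function name, followed by the argument nodes.
object Parse(string text, ref int pos)
{
    while (pos < text.Length && text[pos] == ' ') pos++;

    if (text[pos] == '\'')
    {
        // Quoted literal: read up to the closing quote.
        var end = text.IndexOf('\'', pos + 1);
        var literal = text.Substring(pos + 1, end - pos - 1);
        pos = end + 1;
        return literal;
    }

    // Bare token: a function name or an unquoted literal such as True.
    var start = pos;
    while (pos < text.Length && char.IsLetterOrDigit(text[pos])) pos++;
    var token = text.Substring(start, pos - start);
    if (pos >= text.Length || text[pos] != '(') return token;

    pos++; // consume '('
    var node = new List<object> { token };
    while (true)
    {
        while (text[pos] == ' ') pos++;
        if (text[pos] == ')') { pos++; break; }
        node.Add(Parse(text, ref pos));
        while (text[pos] == ' ') pos++;
        if (text[pos] == ',') pos++;
    }

    return node.ToArray();
}

object ParseExpression(string text)
{
    var pos = 0;
    return Parse(text, ref pos);
}

string Evaluate(object node, IReadOnlyDictionary<string, string> parameterValues)
{
    if (node is string literal) return literal;

    var call = (object[])node;
    var name = (string)call[0];
    // Children are evaluated before their parent, so nesting depth doesn't matter.
    var args = call.Skip(1).Select(child => Evaluate(child, parameterValues)).ToArray();

    return name switch
    {
        "parameters" => parameterValues[args[0]],
        "if" => args[0] == "True" ? args[1] : args[2],
        "format" => string.Format(args[0], args.Skip(1).Cast<object>().ToArray()),
        _ => throw new NotSupportedException($"Unsupported function: {name}"),
    };
}

// Hypothetical parameter values mirroring testCase1.
var treeParameters = new Dictionary<string, string>
{
    ["parAcrName"] = "acr",
    ["parEnvironment"] = "dev",
    ["parSuffix"] = "xyz",
    ["parIsPrivate"] = "True",
};

var tree = ParseExpression("format('workout{0}{1}{2}{3}', parameters('parAcrName'), parameters('parEnvironment'), parameters('parSuffix'), if(parameters('parIsPrivate'), 'private', 'public'))");
Console.WriteLine(Evaluate(tree, treeParameters)); // workoutacrdevxyzprivate
```

With a tree in place, the evaluation order falls out of the recursion and no separate passes are needed. Note that this sketch evaluates both branches of if() before picking one, which a real implementation would probably want to avoid.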