Azure Batch – Unusable nodes after Starting for a long time when using certificates with Azure Key Vault


Azure Batch account certificates retirement

The article Migrate Batch account certificates to Azure Key Vault – Azure Batch | Microsoft Learn states that the Azure Batch account certificates feature will be retired on February 29, 2024. It provides links to an alternative and an FAQ. However, following the alternative as documented can leave you with unusable nodes.

The article Enable automatic certificate rotation in a Batch pool – Azure Batch | Microsoft Learn walks step by step through creating everything needed. At a high level, these are the steps:

  1. Create a user-assigned identity
  2. Create a certificate
  3. Add an access policy in Azure Key Vault – Actually, you should not use access policies and instead use Azure Key Vault RBAC roles.
  4. Create a Batch pool with a user-assigned managed identity – There is a good example provided.
  5. Next Steps – There’s a link to Use extensions with Batch pools – Azure Batch | Microsoft Learn. This has a bad example. EDIT 2023-11-27: This is fixed since Update create-pool-extensions.md example to use Azure Linux by wahidsaleemi · Pull Request #117207 · MicrosoftDocs/azure-docs (github.com) was merged.
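
To make steps 4 and 5 more concrete, here is a minimal sketch of the Key Vault extension entry that gets attached to a Linux Batch pool. It is based on my understanding of the settings documented for the Key Vault VM extension, so check the field names against the current documentation before using it. The function name `key_vault_extension`, the extension name `KVExtensions`, the vault URL and the client ID are placeholders for illustration only, and the extension version field is deliberately left out.

```python
import json


def key_vault_extension(certificate_url: str, identity_client_id: str) -> dict:
    """Build an illustrative Key Vault extension entry for a Linux Batch pool.

    Assumption: field names follow the Key Vault VM extension for Linux
    settings; verify them against the current Microsoft Learn article.
    """
    return {
        "name": "KVExtensions",  # placeholder label, any name can be used
        "publisher": "Microsoft.Azure.KeyVault",
        "type": "KeyVaultForLinux",
        "settings": {
            "secretsManagementSettings": {
                # How often the extension polls Key Vault, in seconds.
                "pollingIntervalInS": "300",
                # Certificates the extension should download and keep fresh.
                "observedCertificates": [certificate_url],
            },
            "authenticationSettings": {
                # Instance metadata endpoint used for managed identity tokens.
                "msiEndpoint": "http://169.254.169.254/metadata/identity",
                # Client ID of the user-assigned identity from step 1.
                "msiClientId": identity_client_id,
            },
        },
    }


if __name__ == "__main__":
    extension = key_vault_extension(
        "https://example-vault.vault.azure.net/secrets/example-cert",  # placeholder
        "00000000-0000-0000-0000-000000000000",  # placeholder client ID
    )
    print(json.dumps(extension, indent=2))
```

The printed JSON is only the extension fragment. The image the pool uses is a separate setting, and that choice is what decides whether the nodes become usable.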

If you follow the original example in the article from step 5 above, it will result in unusable nodes:

[Screenshot: Unusable node]

Solution to unusable nodes

In the first article, there is a link to Azure Key Vault VM Extension for Linux – Azure Virtual Machines | Microsoft Learn. There’s an important section:

The Key Vault VM extension supports these Linux distributions:

  • Ubuntu 20.04, 22.04
  • Azure Linux

I tested the available offers. Alma Linux, OpenLogic (CentOS), and Microsoft Azure Batch (CentOS Container) all result in unusable nodes. Any offer using Ubuntu, Azure Linux (Mariner), and of course Microsoft Windows will work. I really hope this helps others out there!
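
If you want to catch this before creating a pool, a small check along these lines could help. It is only a sketch that encodes the test results above as keyword lists. The function name `key_vault_extension_status` and the keyword matching are my own illustration, not an Azure API, and any offer I did not test is reported as unknown.

```python
# Keywords taken from the test results in this post, not from an Azure API.
FAILING_KEYWORDS = ("alma", "centos", "openlogic")
WORKING_KEYWORDS = ("ubuntu", "azure linux", "mariner", "windows")


def key_vault_extension_status(offer_description: str) -> str:
    """Classify an image offer as 'works', 'unusable', or 'unknown'."""
    text = offer_description.lower()
    # Check failing offers first so a CentOS-based image is never
    # reported as working because of another word in its name.
    if any(keyword in text for keyword in FAILING_KEYWORDS):
        return "unusable"
    if any(keyword in text for keyword in WORKING_KEYWORDS):
        return "works"
    return "unknown"


if __name__ == "__main__":
    for offer in (
        "Ubuntu 22.04",
        "Azure Linux (Mariner)",
        "Microsoft Windows",
        "Alma Linux",
        "OpenLogic (CentOS)",
        "Microsoft Azure Batch (CentOS Container)",
    ):
        print(f"{offer}: {key_vault_extension_status(offer)}")
```

Treat an unknown result as a reason to test the image yourself rather than as a verdict either way.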
