
Network architecture is one of those foundational decisions that shapes everything you build on top of it. Get it right at the start, and you’ll have a scalable, secure platform that grows with your organisation. Get it wrong, and you’ll spend years fighting technical debt, security vulnerabilities, and operational complexity that compounds with every new workload you deploy.

For Azure-based platforms, the hub and spoke topology has emerged as the de facto standard for enterprise network design, and for good reason. It provides the perfect balance between isolation and connectivity, centralises security controls without creating bottlenecks, and scales elegantly as your organisation grows from a handful of applications to hundreds of workloads across multiple regions.

In this comprehensive guide, I’ll walk you through building a production-ready hub and spoke network topology in Azure using Terraform. We’ll explore the architectural principles that make this pattern so effective, implement the infrastructure with security best practices embedded at every layer, and address the operational considerations that separate toy examples from production-ready platforms.

Understanding the Hub and Spoke Pattern

Before we write a single line of Terraform, it’s worth understanding why this architectural pattern exists and what problems it solves. The hub and spoke topology isn’t just about drawing circles and lines on architecture diagrams; it’s a deliberate approach to managing complexity whilst maintaining security and operational efficiency.

The Problem with Flat Networks

In the early days of cloud adoption, many organisations built flat network topologies where every virtual network could communicate with every other virtual network through full-mesh peering. For three or four VNets, this works fine. By the time you reach ten VNets, you’re managing 45 peering connections. At twenty VNets, that number balloons to 190 peering connections.

Beyond the sheer management overhead, flat topologies create security challenges. Every VNet becomes a potential attack vector for every other VNet. Network security groups become impossibly complex as you try to maintain granular control over which applications can communicate. Audit trails become muddled as traffic flows directly between workloads without passing through central inspection points.

The Hub and Spoke Solution

The hub and spoke pattern solves these problems through centralisation. Instead of every VNet peering with every other VNet, spoke VNets peer only with a central hub VNet. The hub contains shared services: firewalls, VPN gateways, DNS servers, and monitoring infrastructure that all spokes can access. Traffic between spokes flows through the hub, where it can be inspected, logged, and controlled.

This topology dramatically reduces complexity. Twenty spoke VNets require only twenty peering connections to the hub, rather than 190 in a mesh topology. Security controls centralise in the hub, making them easier to maintain and audit. Network changes in one spoke don’t affect others unless you explicitly allow it.

The pattern also aligns beautifully with organisational structure. Each spoke can represent a different team, application, or environment. Development teams get their own isolated network space whilst platform engineering maintains central control over security policy, connectivity to on-premises networks, and shared infrastructure.

When to Use This Pattern

Hub and spoke topology makes sense when you have multiple workloads that need some degree of isolation from each other but also require shared services or connectivity to external networks. This describes most enterprise scenarios.

If you’re building a simple proof of concept with a single application in a single VNet, hub and spoke is overkill. But the moment you start thinking about multiple environments, multiple teams, or connections to on-premises networks, this pattern starts paying dividends. The initial investment in building the hub pays off quickly as you add spokes without increasing complexity.

Architectural Design Decisions

Every hub and spoke implementation requires making deliberate choices about how to structure the network, secure traffic, and handle connectivity. These decisions shape everything from IP address allocation to firewall rules to routing configuration.

IP Address Planning

IP address planning is one of those tasks that feels tedious until you get it wrong, at which point it becomes a nightmare. Once workloads are deployed and using IP addresses, changing them requires downtime, coordination, and significant risk.

The cardinal rule is to be generous with address space allocation. Azure supports RFC 1918 private address ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16), and there’s no cost to reserving large blocks. I recommend allocating a /16 (65,536 addresses) for the hub and at least a /20 (4,096 addresses) for each spoke, with room to grow.

Within the hub, subdivide address space by function. Allocate a subnet for Azure Firewall, another for VPN Gateway, another for Azure Bastion, and so on. Each Azure service has specific subnet sizing requirements: Azure Firewall needs at least a /26, VPN Gateway at least a /27, and Azure Bastion at least a /26. Factor in future growth when sizing these subnets.
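Terraform’s `cidrsubnet` function makes it straightforward to carve these differently sized subnets out of a single block. A minimal sketch, assuming a 10.0.0.0/16 hub (the `newbits` argument is the difference between the parent prefix length and the target prefix length):

```hcl
locals {
  hub_cidr = "10.0.0.0/16"

  # /16 + 10 newbits = /26 (64 addresses), the minimum for AzureFirewallSubnet
  firewall_prefix = cidrsubnet(local.hub_cidr, 10, 0) # "10.0.0.0/26"

  # /16 + 11 newbits = /27 (32 addresses), the minimum for GatewaySubnet
  gateway_prefix = cidrsubnet(local.hub_cidr, 11, 2) # "10.0.0.64/27"

  # /16 + 10 newbits = /26 again, the minimum for AzureBastionSubnet
  bastion_prefix = cidrsubnet(local.hub_cidr, 10, 2) # "10.0.0.128/26"
}
```

The network indices are chosen so the three ranges don’t overlap; in practice you’d probably allocate more room than the bare minimums, as discussed above.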

For spoke VNets, organise subnets by tier or function. Web tier gets a subnet, application tier gets another, data tier gets a third. Keep subnet sizes consistent across spokes where possible—it makes automation easier and helps with capacity planning.

Document your IP address allocation scheme from the start. Future you, six months from now, will thank present you for writing down which address ranges are allocated to which environments and teams.

Hub VNet Design

The hub VNet serves as the central point of connectivity for your entire network topology. It needs to accommodate several distinct functions, each with its own subnet requirements.

Azure Firewall sits at the heart of the hub, inspecting all traffic between spokes and controlling outbound internet access. It requires a dedicated subnet named AzureFirewallSubnet with a minimum size of /26, though /24 gives you more room for scale.

If you’re connecting to on-premises networks via VPN or ExpressRoute, you’ll need a Gateway subnet. VPN Gateway and ExpressRoute Gateway both require a subnet named GatewaySubnet with a minimum size of /27, though /26 is recommended for production deployments.

Azure Bastion provides secure RDP and SSH access to virtual machines without exposing them to the public internet. It needs a subnet named AzureBastionSubnet with a minimum size of /26.

For shared services like jump boxes, monitoring infrastructure, or centralised DNS servers, create additional subnets sized appropriately for your needs. A /27 or /28 often suffices for these management workloads.

Spoke VNet Design

Spoke VNets contain your actual workloads: the applications, databases, and services that deliver value to your organisation. Design them with isolation and security in mind.

Each spoke should map to a clear organisational or technical boundary. You might create spokes per environment (development, staging, production), per team (platform, data, mobile), per application tier (frontend, backend, data), or some combination. The key is consistency; establish a pattern and stick to it.

Within each spoke, use subnets to create security boundaries. Don’t put web servers and databases in the same subnet. Subnet boundaries allow you to apply network security groups that control which traffic is allowed. A compromised web server in its own subnet can be prevented from accessing database servers in a different subnet.

Consider creating separate subnets for private endpoints if you’re using Azure PaaS services like Storage Accounts or SQL Databases with private connectivity. This makes it easier to manage DNS resolution and network security group rules for these endpoints.
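As a sketch, a dedicated private-endpoint subnet might look like the following (the resource names and address prefix are illustrative, not part of the configuration built later in this guide):

```hcl
resource "azurerm_subnet" "private_endpoints" {
  name                 = "snet-private-endpoints"
  resource_group_name  = azurerm_resource_group.spoke.name
  virtual_network_name = azurerm_virtual_network.spoke.name
  address_prefixes     = ["10.1.4.0/24"]

  # "Enabled" applies NSG rules to private endpoints in this subnet.
  # Recent azurerm provider versions use this attribute; older releases
  # exposed it as a boolean instead.
  private_endpoint_network_policies = "Enabled"
}
```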

Routing Architecture

In a hub and spoke topology, routing determines how traffic flows between spokes. By default, spoke-to-spoke traffic doesn’t work even with peering configured; Azure VNet peering is non-transitive. If Spoke A peers with the Hub, and Spoke B peers with the Hub, traffic from Spoke A cannot reach Spoke B without additional configuration.

This is actually a security feature. It means you explicitly control which spokes can communicate, rather than allowing all-to-all communication by default.

To enable spoke-to-spoke communication, you have two options. The simpler approach uses Azure Firewall as a router. Configure user-defined routes (UDRs) in each spoke that send traffic destined for other spoke address ranges to the Azure Firewall. The firewall inspects the traffic and forwards it to the destination spoke.

The alternative approach deploys a network virtual appliance (NVA) in the hub for routing, but for most scenarios, Azure Firewall provides sufficient routing capabilities alongside its security functions. Using a single service for both simplifies management and reduces cost.

Connectivity Options

Your hub and spoke topology likely needs connectivity beyond Azure. On-premises datacentres, remote offices, and mobile workers all need secure access to cloud resources.

For site-to-site connectivity to on-premises networks, you’ll choose between VPN Gateway and ExpressRoute. VPN Gateway provides encrypted IPsec tunnels over the public internet and supports throughput up to 10 Gbps with the VpnGw5 SKU. It’s cost-effective and suitable for most scenarios.

ExpressRoute provides private connectivity that doesn’t traverse the public internet, with dedicated bandwidth from 50 Mbps to 100 Gbps. It costs more but provides better performance, reliability, and security for mission-critical workloads. Many organisations use both; ExpressRoute for primary connectivity and VPN Gateway as a failover path.

For remote user access, Azure Bastion provides secure RDP and SSH connectivity without exposing virtual machines to the internet. It eliminates the need for jump boxes with public IP addresses and provides audit logging of all remote access sessions.

Visual Architecture

To help visualise how all these components fit together, here’s a diagram showing a typical hub and spoke topology with three spoke VNets:

```mermaid
graph TB
    subgraph Internet["Internet"]
        OnPrem["On-Premises Network<br/>192.168.0.0/16"]
        Users["Remote Users"]
    end
    subgraph Hub["Hub VNet<br/>10.0.0.0/16"]
        subgraph FirewallSubnet["AzureFirewallSubnet<br/>10.0.0.0/24"]
            AFW["Azure Firewall<br/>10.0.0.4"]
        end
        subgraph GatewaySubnet["GatewaySubnet<br/>10.0.1.0/24"]
            VPN["VPN Gateway"]
        end
        subgraph BastionSubnet["AzureBastionSubnet<br/>10.0.2.0/24"]
            Bastion["Azure Bastion"]
        end
        subgraph ManagementSubnet["Management Subnet<br/>10.0.3.0/24"]
            Monitor["Monitoring<br/>Infrastructure"]
        end
    end
    subgraph Spoke1["Production Spoke<br/>10.1.0.0/16"]
        subgraph WebSubnet1["Web Subnet<br/>10.1.1.0/24"]
            Web1["Web Servers"]
        end
        subgraph AppSubnet1["App Subnet<br/>10.1.2.0/24"]
            App1["Application<br/>Servers"]
        end
        subgraph DataSubnet1["Data Subnet<br/>10.1.3.0/24"]
            DB1["Databases"]
        end
        RT1["Route Table"]
    end
    subgraph Spoke2["Development Spoke<br/>10.2.0.0/16"]
        subgraph WorkloadSubnet2["Workload Subnet<br/>10.2.1.0/24"]
            Workload2["Development<br/>Workloads"]
        end
        RT2["Route Table"]
    end
    subgraph Spoke3["Shared Services Spoke<br/>10.3.0.0/16"]
        subgraph ServicesSubnet3["Services Subnet<br/>10.3.1.0/24"]
            DNS["DNS Servers"]
            AD["Active Directory"]
        end
        RT3["Route Table"]
    end
    OnPrem -.->|"VPN Tunnel<br/>IPsec"| VPN
    Users -.->|"RDP/SSH"| Bastion
    VPN -->|"Peering"| AFW
    Bastion -.->|"Secure Access"| Web1
    Bastion -.->|"Secure Access"| Workload2
    AFW <-->|"VNet Peering<br/>Allow Gateway Transit"| Spoke1
    AFW <-->|"VNet Peering<br/>Allow Gateway Transit"| Spoke2
    AFW <-->|"VNet Peering<br/>Allow Gateway Transit"| Spoke3
    RT1 -.->|"0.0.0.0/0 → Firewall<br/>10.2.0.0/16 → Firewall<br/>10.3.0.0/16 → Firewall"| AFW
    RT2 -.->|"0.0.0.0/0 → Firewall<br/>10.1.0.0/16 → Firewall<br/>10.3.0.0/16 → Firewall"| AFW
    RT3 -.->|"0.0.0.0/0 → Firewall<br/>10.1.0.0/16 → Firewall<br/>10.2.0.0/16 → Firewall"| AFW
    Web1 -->|"App Traffic"| App1
    App1 -->|"Database<br/>Queries"| DB1
    Workload2 -.->|"Via Firewall"| DNS
    App1 -.->|"Via Firewall"| DNS
    AFW -->|"Inspect & Log"| Internet
    Monitor -.->|"Collect Logs"| AFW
    Monitor -.->|"Collect Logs"| VPN
    Monitor -.->|"Collect Logs"| Bastion
    classDef hubStyle fill:#0078d4,stroke:#003d7a,stroke-width:2px,color:#fff
    classDef spokeStyle fill:#50e6ff,stroke:#0078d4,stroke-width:2px,color:#000
    classDef securityStyle fill:#ff6b6b,stroke:#c92a2a,stroke-width:2px,color:#fff
    classDef gatewayStyle fill:#69db7c,stroke:#2b8a3e,stroke-width:2px,color:#000
    class AFW securityStyle
    class VPN,Bastion gatewayStyle
    class Spoke1,Spoke2,Spoke3 spokeStyle
```


This diagram illustrates several key architectural points. The hub sits at the centre with four distinct subnets, each serving a specific purpose. Azure Firewall acts as the central inspection point for all inter-spoke traffic and internet-bound traffic. VPN Gateway provides the encrypted tunnel to on-premises networks, whilst Azure Bastion offers secure administrative access without exposing VMs to the internet.

Each spoke VNet peers directly with the hub but not with other spokes. The route tables in each spoke contain user-defined routes that send traffic destined for other spokes through the Azure Firewall’s private IP address. This ensures all spoke-to-spoke communication flows through the firewall where it can be inspected, logged, and controlled by centralised firewall policies.

Notice how the production spoke uses a traditional three-tier subnet design (web, application, data), whilst the development spoke has a simpler structure with a single workload subnet. The shared services spoke contains infrastructure used by multiple teams, like DNS servers and Active Directory. This flexibility in spoke design is one of the pattern’s strengths—each spoke can be structured according to its specific requirements whilst maintaining consistent connectivity and security through the hub.

Infrastructure as Code with Terraform

Now let’s translate these architectural principles into actual infrastructure. We’ll build this incrementally, starting with the hub and progressively adding spokes and connectivity.

Foundation and Variables

Start by defining the variables that will drive your configuration. Create variables.tf:

variable "environment" {
  description = "Environment name (e.g., prod, staging, dev)"
  type        = string
}

variable "location" {
  description = "Primary Azure region"
  type        = string
  default     = "uksouth"
}

variable "organisation" {
  description = "Organisation name for resource naming"
  type        = string
}

variable "hub_vnet_address_space" {
  description = "Address space for hub VNet"
  type        = list(string)
  default     = ["10.0.0.0/16"]
}

variable "spoke_vnets" {
  description = "Map of spoke VNets to create"
  type = map(object({
    address_space = list(string)
    subnets = map(object({
      address_prefix = string
      service_endpoints = optional(list(string), [])
      delegation = optional(object({
        name = string
        service_delegation = object({
          name    = string
          actions = optional(list(string), [])
        })
      }))
    }))
  }))
  default = {}
}

variable "enable_vpn_gateway" {
  description = "Whether to deploy VPN Gateway in hub"
  type        = bool
  default     = true
}

variable "enable_bastion" {
  description = "Whether to deploy Azure Bastion in hub"
  type        = bool
  default     = true
}

variable "on_premises_address_spaces" {
  description = "Address spaces of on-premises networks for VPN"
  type        = list(string)
  default     = []
}

variable "tags" {
  description = "Tags to apply to all resources"
  type        = map(string)
  default     = {}
}
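To make the spoke_vnets variable concrete, here’s an illustrative terraform.tfvars matching the addressing used in the diagram (the organisation name and exact prefixes are examples, not prescriptions):

```hcl
environment  = "prod"
organisation = "contoso" # example value

spoke_vnets = {
  production = {
    address_space = ["10.1.0.0/16"]
    subnets = {
      web = { address_prefix = "10.1.1.0/24" }
      app = { address_prefix = "10.1.2.0/24" }
      data = {
        address_prefix    = "10.1.3.0/24"
        service_endpoints = ["Microsoft.Sql"]
      }
    }
  }
  development = {
    address_space = ["10.2.0.0/16"]
    subnets = {
      workload = { address_prefix = "10.2.1.0/24" }
    }
  }
}

on_premises_address_spaces = ["192.168.0.0/16"]
```

Because service_endpoints and delegation are declared with optional(), subnets only need to specify the attributes they actually use.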

Create locals.tf for computed values:

locals {
  # Common tags applied to all resources
  common_tags = merge(
    var.tags,
    {
      environment = var.environment
      managed_by  = "terraform"
      pattern     = "hub-and-spoke"
    }
  )

  # Resource naming convention
  hub_name = "hub-${var.environment}-${var.location}"
  
  # Hub subnet configuration
  hub_subnets = {
    firewall = {
      name           = "AzureFirewallSubnet"
      address_prefix = cidrsubnet(var.hub_vnet_address_space[0], 8, 0)
    }
    gateway = {
      name           = "GatewaySubnet"
      address_prefix = cidrsubnet(var.hub_vnet_address_space[0], 8, 1)
    }
    bastion = {
      name           = "AzureBastionSubnet"
      address_prefix = cidrsubnet(var.hub_vnet_address_space[0], 8, 2)
    }
    management = {
      name           = "snet-management"
      address_prefix = cidrsubnet(var.hub_vnet_address_space[0], 8, 3)
    }
  }
}
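With the default 10.0.0.0/16 address space, each cidrsubnet(..., 8, n) call above produces a /24. A throwaway output is a handy way to sanity-check the allocation before deploying anything:

```hcl
output "hub_subnet_prefixes" {
  description = "Computed hub subnet prefixes, for verification"
  value       = { for key, subnet in local.hub_subnets : key => subnet.address_prefix }
  # With the default 10.0.0.0/16 hub address space this evaluates to:
  #   firewall   = "10.0.0.0/24"
  #   gateway    = "10.0.1.0/24"
  #   bastion    = "10.0.2.0/24"
  #   management = "10.0.3.0/24"
}
```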

Hub Virtual Network

Create the hub VNet with all its subnets. Create hub.tf:

# Resource group for hub networking
resource "azurerm_resource_group" "hub" {
  name     = "rg-network-${local.hub_name}"
  location = var.location
  tags     = local.common_tags
}

# Hub virtual network
resource "azurerm_virtual_network" "hub" {
  name                = "vnet-${local.hub_name}"
  location            = azurerm_resource_group.hub.location
  resource_group_name = azurerm_resource_group.hub.name
  address_space       = var.hub_vnet_address_space

  tags = local.common_tags
}

# Hub subnets
resource "azurerm_subnet" "hub_firewall" {
  name                 = local.hub_subnets.firewall.name
  resource_group_name  = azurerm_resource_group.hub.name
  virtual_network_name = azurerm_virtual_network.hub.name
  address_prefixes     = [local.hub_subnets.firewall.address_prefix]
}

resource "azurerm_subnet" "hub_gateway" {
  count = var.enable_vpn_gateway ? 1 : 0

  name                 = local.hub_subnets.gateway.name
  resource_group_name  = azurerm_resource_group.hub.name
  virtual_network_name = azurerm_virtual_network.hub.name
  address_prefixes     = [local.hub_subnets.gateway.address_prefix]
}

resource "azurerm_subnet" "hub_bastion" {
  count = var.enable_bastion ? 1 : 0

  name                 = local.hub_subnets.bastion.name
  resource_group_name  = azurerm_resource_group.hub.name
  virtual_network_name = azurerm_virtual_network.hub.name
  address_prefixes     = [local.hub_subnets.bastion.address_prefix]
}

resource "azurerm_subnet" "hub_management" {
  name                 = local.hub_subnets.management.name
  resource_group_name  = azurerm_resource_group.hub.name
  virtual_network_name = azurerm_virtual_network.hub.name
  address_prefixes     = [local.hub_subnets.management.address_prefix]

  service_endpoints = [
    "Microsoft.Storage",
    "Microsoft.KeyVault"
  ]
}

# Network security group for management subnet
resource "azurerm_network_security_group" "hub_management" {
  name                = "nsg-${local.hub_name}-management"
  location            = azurerm_resource_group.hub.location
  resource_group_name = azurerm_resource_group.hub.name

  tags = local.common_tags
}

# NSG association
resource "azurerm_subnet_network_security_group_association" "hub_management" {
  subnet_id                 = azurerm_subnet.hub_management.id
  network_security_group_id = azurerm_network_security_group.hub_management.id
}
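The NSG above is created empty; the rules you add depend on what runs in the management subnet. As one hedged example, allowing SSH into management workloads only from the Bastion subnet might look like this (it reuses the names defined above, and the prefixes assume the default hub addressing):

```hcl
resource "azurerm_network_security_rule" "allow_bastion_ssh" {
  name                        = "AllowBastionSSHInbound"
  priority                    = 100
  direction                   = "Inbound"
  access                      = "Allow"
  protocol                    = "Tcp"
  source_port_range           = "*"
  destination_port_range      = "22"
  # Only Azure Bastion may initiate SSH sessions into management VMs
  source_address_prefix       = local.hub_subnets.bastion.address_prefix
  destination_address_prefix  = local.hub_subnets.management.address_prefix
  resource_group_name         = azurerm_resource_group.hub.name
  network_security_group_name = azurerm_network_security_group.hub_management.name
}
```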

Azure Firewall

Azure Firewall provides the central security control point for the topology. Create firewall.tf:

# Public IP for Azure Firewall
resource "azurerm_public_ip" "firewall" {
  name                = "pip-firewall-${local.hub_name}"
  location            = azurerm_resource_group.hub.location
  resource_group_name = azurerm_resource_group.hub.name
  allocation_method   = "Static"
  sku                 = "Standard"
  zones               = ["1", "2", "3"]

  tags = local.common_tags
}

# Azure Firewall
resource "azurerm_firewall" "hub" {
  name                = "afw-${local.hub_name}"
  location            = azurerm_resource_group.hub.location
  resource_group_name = azurerm_resource_group.hub.name
  sku_name            = "AZFW_VNet"
  sku_tier            = "Standard"
  firewall_policy_id  = azurerm_firewall_policy.hub.id
  zones               = ["1", "2", "3"]

  ip_configuration {
    name                 = "configuration"
    subnet_id            = azurerm_subnet.hub_firewall.id
    public_ip_address_id = azurerm_public_ip.firewall.id
  }

  tags = local.common_tags
}

# Firewall Policy
resource "azurerm_firewall_policy" "hub" {
  name                = "afwp-${local.hub_name}"
  location            = azurerm_resource_group.hub.location
  resource_group_name = azurerm_resource_group.hub.name
  sku                 = "Standard"
  
  threat_intelligence_mode = "Alert"

  dns {
    proxy_enabled = true
  }

  # Note: intrusion_detection (IDPS) is only available with the Premium
  # policy SKU; change sku to "Premium" before enabling it here.

  tags = local.common_tags
}

# Firewall Policy Rule Collection Group - Network Rules
resource "azurerm_firewall_policy_rule_collection_group" "network_rules" {
  name               = "network-rules"
  firewall_policy_id = azurerm_firewall_policy.hub.id
  priority           = 100

  network_rule_collection {
    name     = "allow-spoke-to-spoke"
    priority = 100
    action   = "Allow"

    rule {
      name                  = "allow-all-spoke-to-spoke"
      protocols             = ["Any"]
      source_addresses      = [for k, v in var.spoke_vnets : v.address_space[0]]
      destination_addresses = [for k, v in var.spoke_vnets : v.address_space[0]]
      destination_ports     = ["*"]
    }
  }

  network_rule_collection {
    name     = "allow-dns"
    priority = 110
    action   = "Allow"

    rule {
      name                  = "allow-dns-outbound"
      protocols             = ["UDP"]
      source_addresses      = [for k, v in var.spoke_vnets : v.address_space[0]]
      destination_addresses = ["*"]
      destination_ports     = ["53"]
    }
  }
}

# Firewall Policy Rule Collection Group - Application Rules
resource "azurerm_firewall_policy_rule_collection_group" "application_rules" {
  name               = "application-rules"
  firewall_policy_id = azurerm_firewall_policy.hub.id
  priority           = 200

  application_rule_collection {
    name     = "allow-azure-services"
    priority = 100
    action   = "Allow"

    rule {
      name = "allow-azure-management"
      protocols {
        type = "Https"
        port = 443
      }
      source_addresses = [for k, v in var.spoke_vnets : v.address_space[0]]
      destination_fqdns = [
        "*.azure.com",
        "*.microsoft.com",
        "*.windows.net",
        "*.azure-automation.net"
      ]
    }
  }

  application_rule_collection {
    name     = "allow-ubuntu-updates"
    priority = 110
    action   = "Allow"

    rule {
      name = "allow-apt-repositories"
      protocols {
        type = "Http"
        port = 80
      }
      protocols {
        type = "Https"
        port = 443
      }
      source_addresses = [for k, v in var.spoke_vnets : v.address_space[0]]
      destination_fqdns = [
        "*.ubuntu.com",
        "*.canonical.com"
      ]
    }
  }
}
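Because the policy enables DNS proxying, spokes can use the firewall as their resolver, which is what makes FQDN-based rules behave consistently. A sketch of pointing a VNet at the firewall’s private IP (the resource shown is a standalone example; in this guide you’d set dns_servers on the spoke VNets defined later instead):

```hcl
resource "azurerm_virtual_network" "spoke_with_dns" {
  name                = "vnet-spoke-example" # illustrative name
  location            = azurerm_resource_group.hub.location
  resource_group_name = azurerm_resource_group.hub.name
  address_space       = ["10.9.0.0/16"]

  # Resolve DNS through Azure Firewall's DNS proxy so the firewall and
  # the workloads agree on FQDN-to-IP resolution.
  dns_servers = [azurerm_firewall.hub.ip_configuration[0].private_ip_address]
}
```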

VPN Gateway

For hybrid connectivity to on-premises networks, deploy VPN Gateway. Create vpn-gateway.tf:

# Public IP for VPN Gateway
resource "azurerm_public_ip" "vpn_gateway" {
  count = var.enable_vpn_gateway ? 1 : 0

  name                = "pip-vpngw-${local.hub_name}"
  location            = azurerm_resource_group.hub.location
  resource_group_name = azurerm_resource_group.hub.name
  allocation_method   = "Static"
  sku                 = "Standard"
  zones               = ["1", "2", "3"]

  tags = local.common_tags
}

# VPN Gateway
resource "azurerm_virtual_network_gateway" "hub" {
  count = var.enable_vpn_gateway ? 1 : 0

  name                = "vpngw-${local.hub_name}"
  location            = azurerm_resource_group.hub.location
  resource_group_name = azurerm_resource_group.hub.name

  type     = "Vpn"
  vpn_type = "RouteBased"

  active_active = false
  enable_bgp    = true
  sku           = "VpnGw2AZ"
  generation    = "Generation2"

  ip_configuration {
    name                          = "vnetGatewayConfig"
    public_ip_address_id          = azurerm_public_ip.vpn_gateway[0].id
    private_ip_address_allocation = "Dynamic"
    subnet_id                     = azurerm_subnet.hub_gateway[0].id
  }

  bgp_settings {
    asn = 65515
  }

  tags = local.common_tags
}

# Local Network Gateway for on-premises
resource "azurerm_local_network_gateway" "onpremises" {
  count = var.enable_vpn_gateway && length(var.on_premises_address_spaces) > 0 ? 1 : 0

  name                = "lng-onpremises-${local.hub_name}"
  location            = azurerm_resource_group.hub.location
  resource_group_name = azurerm_resource_group.hub.name

  gateway_address = "0.0.0.0" # Replace with actual on-premises gateway IP
  address_space   = var.on_premises_address_spaces

  bgp_settings {
    asn                 = 65000
    bgp_peering_address = "192.168.1.1" # Replace with actual on-premises BGP peer IP
  }

  tags = local.common_tags
}
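The local network gateway on its own doesn’t establish a tunnel; a connection resource ties it to the VPN gateway. A hedged sketch (var.vpn_shared_key is a hypothetical variable not declared earlier; in practice, source the shared key from Key Vault or a sensitive variable rather than hard-coding it):

```hcl
resource "azurerm_virtual_network_gateway_connection" "onpremises" {
  count = var.enable_vpn_gateway && length(var.on_premises_address_spaces) > 0 ? 1 : 0

  name                = "cn-onpremises-${local.hub_name}"
  location            = azurerm_resource_group.hub.location
  resource_group_name = azurerm_resource_group.hub.name

  type                       = "IPsec"
  virtual_network_gateway_id = azurerm_virtual_network_gateway.hub[0].id
  local_network_gateway_id   = azurerm_local_network_gateway.onpremises[0].id
  enable_bgp                 = true

  # Hypothetical variable; mark it sensitive or pull it from Key Vault.
  shared_key = var.vpn_shared_key
}
```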

Azure Bastion

For secure remote access to virtual machines, deploy Azure Bastion. Create bastion.tf:

# Public IP for Azure Bastion
resource "azurerm_public_ip" "bastion" {
  count = var.enable_bastion ? 1 : 0

  name                = "pip-bastion-${local.hub_name}"
  location            = azurerm_resource_group.hub.location
  resource_group_name = azurerm_resource_group.hub.name
  allocation_method   = "Static"
  sku                 = "Standard"
  zones               = ["1", "2", "3"]

  tags = local.common_tags
}

# Azure Bastion
resource "azurerm_bastion_host" "hub" {
  count = var.enable_bastion ? 1 : 0

  name                = "bastion-${local.hub_name}"
  location            = azurerm_resource_group.hub.location
  resource_group_name = azurerm_resource_group.hub.name
  sku                 = "Standard"

  copy_paste_enabled     = true
  file_copy_enabled      = true
  ip_connect_enabled     = true
  shareable_link_enabled = false
  tunneling_enabled      = true

  ip_configuration {
    name                 = "configuration"
    subnet_id            = azurerm_subnet.hub_bastion[0].id
    public_ip_address_id = azurerm_public_ip.bastion[0].id
  }

  tags = local.common_tags
}

Spoke Virtual Networks

Create spoke VNets dynamically based on the input variable. Create spokes.tf:

# Resource groups for spoke VNets
resource "azurerm_resource_group" "spokes" {
  for_each = var.spoke_vnets

  name     = "rg-network-spoke-${each.key}-${var.environment}"
  location = var.location
  tags     = local.common_tags
}

# Spoke virtual networks
resource "azurerm_virtual_network" "spokes" {
  for_each = var.spoke_vnets

  name                = "vnet-spoke-${each.key}-${var.environment}"
  location            = azurerm_resource_group.spokes[each.key].location
  resource_group_name = azurerm_resource_group.spokes[each.key].name
  address_space       = each.value.address_space

  tags = local.common_tags
}

# Spoke subnets
resource "azurerm_subnet" "spoke_subnets" {
  for_each = merge([
    for spoke_key, spoke in var.spoke_vnets : {
      for subnet_key, subnet in spoke.subnets :
      "${spoke_key}-${subnet_key}" => merge(subnet, {
        spoke_key  = spoke_key
        subnet_key = subnet_key
        vnet_name  = azurerm_virtual_network.spokes[spoke_key].name
        rg_name    = azurerm_resource_group.spokes[spoke_key].name
      })
    }
  ]...)

  # Use the stored keys rather than splitting each.key, which breaks
  # when spoke or subnet names contain hyphens
  name                 = "snet-${each.value.spoke_key}-${each.value.subnet_key}"
  resource_group_name  = each.value.rg_name
  virtual_network_name = each.value.vnet_name
  address_prefixes     = [each.value.address_prefix]
  service_endpoints    = each.value.service_endpoints

  dynamic "delegation" {
    for_each = each.value.delegation != null ? [each.value.delegation] : []
    content {
      name = delegation.value.name

      service_delegation {
        name    = delegation.value.service_delegation.name
        actions = delegation.value.service_delegation.actions
      }
    }
  }
}

# Network security groups for spoke subnets
resource "azurerm_network_security_group" "spoke_subnets" {
  for_each = azurerm_subnet.spoke_subnets

  name                = "nsg-${each.key}"
  location            = var.location
  resource_group_name = each.value.resource_group_name

  tags = local.common_tags
}

# NSG associations
resource "azurerm_subnet_network_security_group_association" "spoke_subnets" {
  for_each = azurerm_subnet.spoke_subnets

  subnet_id                 = each.value.id
  network_security_group_id = azurerm_network_security_group.spoke_subnets[each.key].id
}

VNet Peering

Connect spokes to the hub through VNet peering. Create peering.tf:

# Hub to spoke peering
resource "azurerm_virtual_network_peering" "hub_to_spoke" {
  for_each = var.spoke_vnets

  name                      = "peer-hub-to-${each.key}"
  resource_group_name       = azurerm_resource_group.hub.name
  virtual_network_name      = azurerm_virtual_network.hub.name
  remote_virtual_network_id = azurerm_virtual_network.spokes[each.key].id

  allow_virtual_network_access = true
  allow_forwarded_traffic      = true
  allow_gateway_transit        = var.enable_vpn_gateway
  use_remote_gateways          = false
}

# Spoke to hub peering
resource "azurerm_virtual_network_peering" "spoke_to_hub" {
  for_each = var.spoke_vnets

  name                      = "peer-${each.key}-to-hub"
  resource_group_name       = azurerm_resource_group.spokes[each.key].name
  virtual_network_name      = azurerm_virtual_network.spokes[each.key].name
  remote_virtual_network_id = azurerm_virtual_network.hub.id

  allow_virtual_network_access = true
  allow_forwarded_traffic      = true
  allow_gateway_transit        = false
  use_remote_gateways          = var.enable_vpn_gateway

  depends_on = [
    azurerm_virtual_network_gateway.hub
  ]
}

Routing Configuration

Configure user-defined routes to direct traffic through Azure Firewall. Create routing.tf:

# Route table for spoke VNets
resource "azurerm_route_table" "spokes" {
  for_each = var.spoke_vnets

  name                = "rt-spoke-${each.key}-${var.environment}"
  location            = azurerm_resource_group.spokes[each.key].location
  resource_group_name = azurerm_resource_group.spokes[each.key].name

  disable_bgp_route_propagation = false

  tags = local.common_tags
}

# Route to send internet traffic through firewall
resource "azurerm_route" "spoke_internet_via_firewall" {
  for_each = var.spoke_vnets

  name                   = "route-internet-via-firewall"
  resource_group_name    = azurerm_resource_group.spokes[each.key].name
  route_table_name       = azurerm_route_table.spokes[each.key].name
  address_prefix         = "0.0.0.0/0"
  next_hop_type          = "VirtualAppliance"
  next_hop_in_ip_address = azurerm_firewall.hub.ip_configuration[0].private_ip_address
}

# Routes to send spoke-to-spoke traffic through firewall
resource "azurerm_route" "spoke_to_spoke_via_firewall" {
  for_each = merge([
    for spoke_key, spoke in var.spoke_vnets : {
      for other_spoke_key, other_spoke in var.spoke_vnets :
      "${spoke_key}-to-${other_spoke_key}" => {
        source_spoke      = spoke_key
        destination_spoke = other_spoke_key
        address_prefix    = other_spoke.address_space[0]
      }
      if spoke_key != other_spoke_key
    }
  ]...)

  name                   = "route-to-${each.value.destination_spoke}"
  resource_group_name    = azurerm_resource_group.spokes[each.value.source_spoke].name
  route_table_name       = azurerm_route_table.spokes[each.value.source_spoke].name
  address_prefix         = each.value.address_prefix
  next_hop_type          = "VirtualAppliance"
  next_hop_in_ip_address = azurerm_firewall.hub.ip_configuration[0].private_ip_address
}

# Associate route tables with spoke subnets
resource "azurerm_subnet_route_table_association" "spoke_subnets" {
  for_each = azurerm_subnet.spoke_subnets

  subnet_id      = each.value.id
  # Subnet keys follow the "<spoke>-<subnet>" pattern, so the spoke key
  # is the first segment (keep spoke names free of hyphens)
  route_table_id = azurerm_route_table.spokes[split("-", each.key)[0]].id
}

Monitoring and Diagnostics

Implement comprehensive monitoring for the network infrastructure. Create monitoring.tf:

# Log Analytics workspace
resource "azurerm_log_analytics_workspace" "network" {
  name                = "log-network-${var.environment}"
  location            = azurerm_resource_group.hub.location
  resource_group_name = azurerm_resource_group.hub.name
  sku                 = "PerGB2018"
  retention_in_days   = 30

  tags = local.common_tags
}

# Diagnostic settings for Azure Firewall
resource "azurerm_monitor_diagnostic_setting" "firewall" {
  name                       = "firewall-diagnostics"
  target_resource_id         = azurerm_firewall.hub.id
  log_analytics_workspace_id = azurerm_log_analytics_workspace.network.id

  enabled_log {
    category = "AzureFirewallApplicationRule"
  }

  enabled_log {
    category = "AzureFirewallNetworkRule"
  }

  enabled_log {
    category = "AzureFirewallDnsProxy"
  }

  metric {
    category = "AllMetrics"
    enabled  = true
  }
}

# Diagnostic settings for VPN Gateway
resource "azurerm_monitor_diagnostic_setting" "vpn_gateway" {
  count = var.enable_vpn_gateway ? 1 : 0

  name                       = "vpngw-diagnostics"
  target_resource_id         = azurerm_virtual_network_gateway.hub[0].id
  log_analytics_workspace_id = azurerm_log_analytics_workspace.network.id

  enabled_log {
    category = "GatewayDiagnosticLog"
  }

  enabled_log {
    category = "TunnelDiagnosticLog"
  }

  enabled_log {
    category = "RouteDiagnosticLog"
  }

  enabled_log {
    category = "IKEDiagnosticLog"
  }

  metric {
    category = "AllMetrics"
    enabled  = true
  }
}

# Diagnostic settings for Bastion
resource "azurerm_monitor_diagnostic_setting" "bastion" {
  count = var.enable_bastion ? 1 : 0

  name                       = "bastion-diagnostics"
  target_resource_id         = azurerm_bastion_host.hub[0].id
  log_analytics_workspace_id = azurerm_log_analytics_workspace.network.id

  enabled_log {
    category = "BastionAuditLogs"
  }

  metric {
    category = "AllMetrics"
    enabled  = true
  }
}

# Network Watcher
# Note: Azure permits only one Network Watcher per region per subscription and
# typically creates one automatically (NetworkWatcher_<region> in the
# NetworkWatcherRG resource group). Import any existing instance into state
# before applying.
resource "azurerm_network_watcher" "main" {
  name                = "nw-${var.environment}-${var.location}"
  location            = var.location
  resource_group_name = azurerm_resource_group.hub.name

  tags = local.common_tags
}

Outputs

Export important values for reference. Create outputs.tf:

output "hub_vnet_id" {
  description = "Resource ID of the hub VNet"
  value       = azurerm_virtual_network.hub.id
}

output "hub_vnet_name" {
  description = "Name of the hub VNet"
  value       = azurerm_virtual_network.hub.name
}

output "firewall_private_ip" {
  description = "Private IP address of Azure Firewall"
  value       = azurerm_firewall.hub.ip_configuration[0].private_ip_address
}

output "spoke_vnet_ids" {
  description = "Map of spoke VNet resource IDs"
  value = {
    for k, v in azurerm_virtual_network.spokes : k => v.id
  }
}

output "spoke_subnet_ids" {
  description = "Map of spoke subnet resource IDs"
  value = {
    for k, v in azurerm_subnet.spoke_subnets : k => v.id
  }
}

output "bastion_fqdn" {
  description = "FQDN of Azure Bastion"
  value       = var.enable_bastion ? azurerm_bastion_host.hub[0].dns_name : null
}

output "vpn_gateway_public_ip" {
  description = "Public IP address of VPN Gateway"
  value       = var.enable_vpn_gateway ? azurerm_public_ip.vpn_gateway[0].ip_address : null
}

Example Variables File

Create terraform.tfvars.example:

environment  = "prod"
location     = "uksouth"
organisation = "contoso"

hub_vnet_address_space = ["10.0.0.0/16"]

spoke_vnets = {
  production = {
    address_space = ["10.1.0.0/16"]
    subnets = {
      web = {
        address_prefix    = "10.1.1.0/24"
        service_endpoints = ["Microsoft.Storage", "Microsoft.KeyVault"]
      }
      app = {
        address_prefix    = "10.1.2.0/24"
        service_endpoints = ["Microsoft.Storage", "Microsoft.Sql"]
      }
      data = {
        address_prefix    = "10.1.3.0/24"
        service_endpoints = ["Microsoft.Sql", "Microsoft.Storage"]
      }
    }
  }
  development = {
    address_space = ["10.2.0.0/16"]
    subnets = {
      workloads = {
        address_prefix    = "10.2.1.0/24"
        service_endpoints = ["Microsoft.Storage"]
      }
    }
  }
}

enable_vpn_gateway = true
enable_bastion     = true

on_premises_address_spaces = ["192.168.0.0/16"]

tags = {
  cost_centre = "platform"
  project     = "network-infrastructure"
}

Security Best Practices

Security in a hub and spoke topology requires defence in depth—multiple layers of protection that work together to prevent, detect, and respond to threats.

Network Segmentation

The hub and spoke pattern inherently provides segmentation by isolating workloads into separate virtual networks. But segmentation within spokes is equally important. Use subnets to create security boundaries between tiers of your application.

A three-tier application should have three subnets minimum: web tier, application tier, and data tier. Network security groups on each subnet enforce which traffic is allowed. The web tier subnet allows inbound HTTPS from the internet and outbound connections to the application tier. The application tier allows inbound connections only from the web tier and outbound connections only to the data tier. The data tier allows inbound connections only from the application tier.

This creates a security posture where even if an attacker compromises the web tier, they cannot directly access databases in the data tier. They must traverse multiple network boundaries, each with its own controls, giving your security team multiple opportunities to detect and block the attack.
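As a sketch of the tier boundaries described above in Terraform (the resource itself is hypothetical rather than part of the configuration built earlier; the subnet prefixes match the example tfvars and port 8080 is an assumed application port), an application-tier NSG might look like:

```hcl
# Hypothetical NSG for the application tier: accepts traffic only from the web tier
resource "azurerm_network_security_group" "app_tier" {
  name                = "nsg-app-${var.environment}"
  location            = azurerm_resource_group.spokes["production"].location
  resource_group_name = azurerm_resource_group.spokes["production"].name

  security_rule {
    name                       = "AllowWebTierInbound"
    priority                   = 100
    direction                  = "Inbound"
    access                     = "Allow"
    protocol                   = "Tcp"
    source_port_range          = "*"
    destination_port_range     = "8080" # assumed application port
    source_address_prefix      = "10.1.1.0/24" # web tier subnet
    destination_address_prefix = "10.1.2.0/24" # app tier subnet
  }

  # Explicit deny-all beneath the allow rules makes the posture obvious in review
  security_rule {
    name                       = "DenyAllInbound"
    priority                   = 4096
    direction                  = "Inbound"
    access                     = "Deny"
    protocol                   = "*"
    source_port_range          = "*"
    destination_port_range     = "*"
    source_address_prefix      = "*"
    destination_address_prefix = "*"
  }

  tags = local.common_tags
}
```

An azurerm_subnet_network_security_group_association resource then attaches the NSG to the application-tier subnet.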

Azure Firewall Configuration

Azure Firewall’s power lies in its policy-based approach to network security. Rather than configuring individual firewall rules on every virtual machine, you define policies centrally and Azure Firewall enforces them for all traffic flowing through the hub.

Start with a deny-all default posture and explicitly allow only the traffic you need. The application rules we configured allow access to Azure services and Ubuntu repositories, but block everything else by default. As your organisation grows, add more specific rules rather than opening up broad access.

Enable threat intelligence in Alert mode initially, which logs potential threats without blocking them. Once you understand your traffic patterns, switch to Alert and Deny mode to actively block connections to known malicious IP addresses and domains.

The DNS proxy feature is particularly valuable. When enabled, Azure Firewall acts as a DNS server for your spokes, allowing you to create firewall rules based on FQDNs rather than IP addresses. This is crucial for SaaS applications and Azure services where IP addresses change frequently.
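Assuming the firewall is driven by a firewall policy (the resource name here is illustrative), both the threat intelligence mode and the DNS proxy are single attributes on azurerm_firewall_policy:

```hcl
resource "azurerm_firewall_policy" "hub" {
  name                = "fwpol-hub-${var.environment}"
  resource_group_name = azurerm_resource_group.hub.name
  location            = azurerm_resource_group.hub.location

  # Start in "Alert"; switch to "Deny" once traffic patterns are understood
  threat_intelligence_mode = "Alert"

  dns {
    # Firewall answers DNS for the spokes, enabling FQDN-based network rules
    proxy_enabled = true
  }
}
```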

Network Security Groups

Network security groups provide granular control at the subnet or network interface level. Where Azure Firewall is your first line of defence, NSGs are your last.

Design NSG rules with specificity. Instead of allowing all TCP traffic, specify the exact ports your application needs. Instead of allowing traffic from any source, limit it to the specific subnets or IP addresses that should have access.

Use application security groups (ASGs) to simplify NSG rule management. Instead of specifying IP addresses in NSG rules, you specify ASGs. Then assign virtual machine network interfaces to ASGs based on their role. When you need to change firewall rules for all web servers, you modify the NSG rule once rather than updating rules for each individual server.
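A minimal sketch of the ASG approach (the NIC reference is hypothetical, and the security_rule fragment would sit inside an NSG such as the web tier's):

```hcl
resource "azurerm_application_security_group" "web_servers" {
  name                = "asg-web-${var.environment}"
  location            = azurerm_resource_group.spokes["production"].location
  resource_group_name = azurerm_resource_group.spokes["production"].name
}

# Each web server NIC joins the ASG; NSG rules then follow the role, not the IP
resource "azurerm_network_interface_application_security_group_association" "web" {
  network_interface_id          = azurerm_network_interface.web.id # hypothetical NIC
  application_security_group_id = azurerm_application_security_group.web_servers.id
}
```

NSG rules then target the group via destination_application_security_group_ids instead of listing addresses, so adding a server never means touching the rules.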

Tag your NSG rules with descriptive names and comments explaining their purpose. Future you, investigating a security incident at 3am, will appreciate knowing why a particular rule exists and what business requirement it serves.

Private Endpoints

For Azure PaaS services like Storage Accounts, SQL Databases, and Key Vaults, use private endpoints to access them over private IP addresses from your virtual network. This keeps data plane traffic entirely within Azure’s network backbone, never traversing the public internet.

Private endpoints create a network interface in your VNet with a private IP address. DNS resolution for the PaaS service hostname returns this private IP address instead of the public IP address. Your applications access the service over the private connection without any code changes.

Organise private endpoints in dedicated subnets within your spoke VNets. This makes it easier to manage network security group rules and DNS configuration. A subnet specifically for private endpoints can have restrictive NSG rules since it only contains network interfaces, not compute resources.
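A sketch of a private endpoint for blob storage, assuming a dedicated "endpoints" subnet key and a hypothetical azurerm_storage_account.data resource (neither appears in the configuration built earlier):

```hcl
resource "azurerm_private_endpoint" "storage" {
  name                = "pe-storage-${var.environment}"
  location            = azurerm_resource_group.spokes["production"].location
  resource_group_name = azurerm_resource_group.spokes["production"].name
  subnet_id           = azurerm_subnet.spoke_subnets["production-endpoints"].id # assumed dedicated subnet

  private_service_connection {
    name                           = "psc-storage"
    private_connection_resource_id = azurerm_storage_account.data.id # hypothetical storage account
    subresource_names              = ["blob"]
    is_manual_connection           = false
  }
}
```

A matching private DNS zone (privatelink.blob.core.windows.net) linked to the VNets is what makes the hostname resolve to the endpoint's private IP.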

Just-in-Time Access

Azure Bastion eliminates the need for virtual machines with public IP addresses, but some scenarios still require SSH or RDP access. Microsoft Defender for Cloud's Just-in-Time (JIT) VM Access feature provides a middle ground.

JIT keeps management ports closed by default and opens them only when needed, for a limited time, and only for authorised users. When you need to access a virtual machine, you request access through the Azure portal or CLI. Azure opens the NSG rules for your specific IP address, allows you to connect, and automatically closes the rules after the time period expires.

This dramatically reduces the attack surface. Rather than having RDP or SSH ports exposed 24/7, they’re only open for the brief periods when legitimate administrators need access. Even then, access is restricted to specific source IP addresses.

Audit and Compliance

Enable diagnostic logging for all network resources. Azure Firewall logs every allowed and denied connection. VPN Gateway logs all tunnel events and configuration changes. Bastion logs every remote access session. This comprehensive audit trail is invaluable for security investigations and compliance requirements.

Send all logs to Azure Monitor Log Analytics where you can query them using KQL, create alerts for suspicious activity, and build dashboards for security operations teams. Configure alerts for events like firewall rule changes, VPN tunnel failures, or unusual traffic patterns.
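As one sketch of such an alert (the KQL query, threshold, and window are illustrative assumptions, and the legacy AzureDiagnostics table shape depends on how the diagnostic settings were configured), a scheduled query rule can fire when denied connections spike:

```hcl
resource "azurerm_monitor_scheduled_query_rules_alert_v2" "firewall_denies" {
  name                = "alert-firewall-denied-spike"
  resource_group_name = azurerm_resource_group.hub.name
  location            = azurerm_resource_group.hub.location

  evaluation_frequency = "PT15M"
  window_duration      = "PT15M"
  scopes               = [azurerm_log_analytics_workspace.network.id]
  severity             = 2

  criteria {
    # Illustrative query: count denied network-rule hits in the window
    query                   = <<-QUERY
      AzureDiagnostics
      | where Category == "AzureFirewallNetworkRule" and msg_s has "Deny"
    QUERY
    time_aggregation_method = "Count"
    threshold               = 100 # tune to your baseline traffic
    operator                = "GreaterThan"
  }
}
```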

For compliance requirements like PCI DSS or HIPAA, the hub and spoke topology provides clear network boundaries that align with compliance scopes. Your PCI DSS environment can live in a dedicated spoke with strict firewall rules controlling all access. Audit logs prove that traffic between scopes flows through inspected and logged network paths.

Operational Considerations

Building the infrastructure is the first step. Operating it successfully over time requires attention to several ongoing concerns.

Capacity Planning

Monitor Azure Firewall throughput to ensure it doesn’t become a bottleneck. The Standard SKU supports up to 30 Gbps of throughput, but actual performance depends on your specific traffic patterns and rule complexity. If you approach capacity limits, consider upgrading to the Premium SKU or deploying multiple firewall instances.

VPN Gateway capacity varies by SKU. The VpnGw1AZ SKU supports up to 650 Mbps aggregate throughput and 30 tunnels. The VpnGw5AZ SKU supports up to 10 Gbps and 100 tunnels. Plan for growth when selecting SKU sizes—upgrading later requires brief downtime.

Spoke VNet capacity rarely becomes an issue since each spoke can have up to 65,536 IP addresses with a /16 allocation. But subnet sizes within spokes require planning. A /24 subnet provides 251 usable addresses (Azure reserves five addresses per subnet). For auto-scaling workloads, ensure subnets have enough addresses to accommodate peak scale.

Cost Optimisation

Hub and spoke topologies have predictable cost components. Azure Firewall costs around £1,000 per month for the firewall itself plus data processing charges. VPN Gateway costs between £100 and £1,500 per month depending on SKU. Azure Bastion costs around £100 per month.

VNet peering incurs charges for data transfer between VNets. Peering between VNets in the same region costs £0.01 per GB in each direction. This seems small but compounds with high traffic volumes. If two services need to exchange large volumes of data frequently, consider deploying them in the same spoke VNet rather than separate spokes.

Enable Azure Advisor cost recommendations and review them regularly. Advisor identifies unused resources like VPN Gateway connections that haven’t carried traffic in weeks or public IP addresses that aren’t associated with any resource.

High Availability

The hub becomes a single point of failure if not designed for high availability. Azure Firewall supports zone redundancy, spreading instances across availability zones. If one zone fails, the firewall continues operating from the other zones. Enable zone redundancy by specifying zones during firewall creation.

VPN Gateway also supports zone redundancy with the AZ gateway SKUs (VpnGw1AZ through VpnGw5AZ). These deploy gateway instances across availability zones, ensuring VPN connectivity remains available even during zone failures. For mission-critical connectivity, deploy both VPN Gateway and ExpressRoute, configuring VPN as a backup path if ExpressRoute fails.

Azure Bastion’s Standard SKU includes high availability with multiple instances deployed automatically. Host scaling adjusts the number of instances based on concurrent sessions, ensuring performance during peak usage.

Consider deploying hub and spoke topologies in multiple regions for disaster recovery. Create a hub in your primary region and another in your secondary region. Spoke VNets in each region peer with their regional hub. Use Azure Traffic Manager or Azure Front Door to distribute traffic between regions.

Automation and GitOps

Store all Terraform configuration in version control and use CI/CD pipelines to deploy changes. This provides audit trails showing who changed what and when, makes it easy to rollback problematic changes, and ensures consistency across environments.

Use Terraform workspaces or separate state files for different environments. Development, staging, and production environments should be identical in structure but isolated in deployment. This allows you to test network changes in development before applying them to production.
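One common approach to state isolation (the storage account and container names below are placeholders) is a separate azurerm backend key per environment:

```hcl
terraform {
  backend "azurerm" {
    resource_group_name  = "rg-tfstate"           # placeholder
    storage_account_name = "sttfstatecontoso"     # placeholder
    container_name       = "tfstate"
    key                  = "network/prod.tfstate" # one key per environment
  }
}
```

Passing the key via `terraform init -backend-config` keeps a single codebase deploying identical structures to isolated state files.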

Implement automated testing for your network infrastructure. Tools like Terratest can validate that your Terraform code creates the expected resources with the correct configuration. Network tests can verify that spoke VNets can communicate through the firewall, that internet access works, and that on-premises connectivity functions correctly.

Troubleshooting Common Issues

Spoke-to-spoke connectivity issues usually stem from routing or firewall rule problems. Use Azure Network Watcher’s Connection Troubleshoot feature to test connectivity between virtual machines in different spokes. It shows you the exact network path traffic takes and identifies where it’s being blocked.

VPN connectivity problems often relate to on-premises firewall configuration or IP addressing conflicts. Ensure on-premises firewalls allow UDP ports 500 and 4500 for IKE traffic. Verify that on-premises address spaces don’t overlap with Azure VNet address spaces. Use VPN Gateway diagnostic logs to identify where tunnel negotiation is failing.

DNS resolution issues can break private endpoint connectivity. When using Azure Firewall’s DNS proxy, ensure spoke VNets configure the firewall’s private IP address as their DNS server. Use the nslookup command from spoke virtual machines to verify DNS resolution returns private IP addresses for PaaS services with private endpoints.

Advanced Patterns and Extensions

Once you’ve mastered the basic hub and spoke topology, several advanced patterns extend its capabilities.

Multi-Region Hub and Spoke

Global organisations often deploy hub and spoke topologies in multiple Azure regions. Each region has its own hub VNet with Azure Firewall, VPN Gateway, and Bastion. Regional spoke VNets peer with their regional hub.

Connect regional hubs using Global VNet Peering, which allows VNets in different regions to communicate over Microsoft’s global backbone network. This enables workloads in spoke VNets in different regions to communicate whilst keeping regional traffic within the region.

Use Azure Virtual WAN as an alternative to manually peering regional hubs. Virtual WAN provides a managed hub and spoke topology with built-in routing, VPN connectivity, and optimised global connectivity. It simplifies management but costs more than self-managed hubs.

Shared Services Spoke

Some organisations create a dedicated shared services spoke within the hub and spoke topology. This spoke contains services used by multiple application spokes: Active Directory domain controllers, DNS servers, monitoring infrastructure, or DevOps tools.

The shared services spoke peers with the hub like any other spoke but receives special firewall rules allowing it to accept connections from all other spokes. Application spokes can initiate connections to shared services, but shared services cannot initiate connections to application spokes, maintaining security boundaries.
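Assuming the firewall is policy-based, a rule collection along these lines (names, prefixes, and ports are illustrative; 10.3.0.0/16 stands in for a shared services spoke not defined in the example tfvars) grants the one-way access described above:

```hcl
resource "azurerm_firewall_policy_rule_collection_group" "shared_services" {
  name               = "rcg-shared-services"
  firewall_policy_id = azurerm_firewall_policy.hub.id # assumes a policy-based firewall
  priority           = 200

  network_rule_collection {
    name     = "allow-spokes-to-shared-services"
    priority = 100
    action   = "Allow"

    rule {
      name                  = "spokes-to-shared-dns-ldap"
      protocols             = ["TCP", "UDP"]
      source_addresses      = ["10.1.0.0/16", "10.2.0.0/16"] # application spokes
      destination_addresses = ["10.3.0.0/16"]                # hypothetical shared services spoke
      destination_ports     = ["53", "389"]                  # DNS and LDAP as examples
    }
  }
}
```

No reciprocal rule exists, so connections can only be initiated towards the shared services spoke, never from it.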

Network Virtual Appliances

Whilst Azure Firewall meets most requirements, some scenarios need third-party network virtual appliances (NVAs) for features like SD-WAN, advanced threat detection, or integration with existing on-premises security infrastructure.

Deploy NVAs in the hub VNet in a dedicated subnet. Configure user-defined routes to send traffic through the NVA instead of Azure Firewall. Many NVA vendors provide Azure Marketplace images and ARM templates that simplify deployment.

NVAs require careful capacity planning and high availability design. Deploy multiple instances in an availability set or availability zones with a load balancer distributing traffic between them. Monitor NVA performance and scale instances as traffic grows.

Azure Firewall Manager

For organisations with multiple hub and spoke topologies across regions or subscriptions, Azure Firewall Manager centralises policy management. Instead of configuring firewall rules separately in each hub, you define policies once in Firewall Manager and apply them to multiple firewalls.

Firewall Manager supports policy inheritance, where child policies inherit rules from parent policies and add environment-specific rules. This allows you to define organisation-wide security policies centrally whilst giving regional teams flexibility to add rules for their specific needs.
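In Terraform, this inheritance is expressed with the base_policy_id attribute; a sketch assuming a hypothetical organisation-wide parent policy:

```hcl
# Organisation-wide rules defined once (hypothetical parent policy)
resource "azurerm_firewall_policy" "org_base" {
  name                = "fwpol-org-base"
  resource_group_name = azurerm_resource_group.hub.name
  location            = azurerm_resource_group.hub.location
}

# Regional policy inherits every rule from the parent and layers on its own
resource "azurerm_firewall_policy" "regional" {
  name                = "fwpol-uksouth-${var.environment}"
  resource_group_name = azurerm_resource_group.hub.name
  location            = azurerm_resource_group.hub.location
  base_policy_id      = azurerm_firewall_policy.org_base.id
}
```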

Conclusion

The hub and spoke network topology represents a mature, battle-tested approach to Azure network architecture. It solves the fundamental challenges of cloud networking—providing isolation between workloads whilst enabling controlled connectivity, centralising security enforcement without creating bottlenecks, and scaling elegantly from small deployments to global enterprise platforms.

Building a production-ready hub and spoke topology requires careful planning around IP addressing, routing, security, and high availability. The Terraform code we’ve built demonstrates these principles in practice, creating infrastructure that’s secure by default, scales with your organisation, and provides the observability needed for effective operations.

The initial investment in building a proper hub and spoke topology pays dividends over time. Each new spoke you add follows the established pattern, inheriting security controls and connectivity automatically. Your platform engineering team maintains central control over security policy and hybrid connectivity whilst development teams get the isolation and autonomy they need to move quickly.

As your Azure footprint grows, the hub and spoke pattern grows with you. Additional spokes, regional hubs, and advanced features like Azure Firewall Manager extend the topology without requiring fundamental redesign. You’ve built a foundation that will serve your organisation reliably for years, providing the network infrastructure on which thousands of workloads can run securely and efficiently.
