Infrastructure as Code Platforms

IaC platforms promise declarative, version-controlled infrastructure – but network reality reveals a “Day-2 operations gap” where provisioning tools struggle with ongoing operational workflows.

Why This Research Matters

Infrastructure as Code transformed cloud operations, and the promise is seductive for network teams: declarative configurations, automated provisioning, and version-controlled infrastructure. Terraform’s 70%+ market share proves the appeal. But the reality of network automation with IaC tools is more nuanced than vendor marketing suggests.

The fundamental challenge: IaC platforms excel at provisioning (20-30% of network service delivery) but struggle with operations (the remaining 70-80%). A firewall rule deployed successfully doesn’t mean traffic flows correctly. Cloud resources can be destroyed and recreated; network configurations affect live traffic and can’t be “rolled back” without careful planning.

This analysis examines what actually happens when you apply IaC methodologies to network infrastructure – not the demo scenarios, but the production realities of provider quality inconsistencies, state management complexity, and the operational workflows that IaC wasn’t designed to handle.

What you’ll find:
  • Provider quality assessment across three tiers – from rock-solid AWS providers to unreliable third-party network providers

  • State management failure modes with real-world scenarios of corruption, drift, and multi-team conflicts

  • The Day-2 operations gap quantified: why provisioning tools struggle with troubleshooting, validation, and operational tasks

  • Platform comparison frameworks scoring Terraform, Pulumi, and CloudFormation across network-specific criteria

  • Strategic recommendations for when IaC works brilliantly vs. when you need orchestration


Who this is for:
  • Teams evaluating whether Terraform/Pulumi can replace vendor management platforms
  • Organizations struggling with IaC provider quality or state management complexity
  • Leaders wondering why infrastructure automation succeeds in cloud but struggles with network devices
  • Engineers deciding whether to invest deeper in IaC or look toward orchestration platforms

The goal: Help you understand where IaC tools fit in network automation architectures – and where their architectural limitations require complementary approaches.

Key Research Findings

Platform | Market Share | Network Provider Quality | Primary Challenge
--- | --- | --- | ---
Terraform | 70%+ (dominant) | Highly variable by vendor | Provider quality inconsistency
Pulumi | 15% (growing) | Programming language complexity | Steep learning curve for network teams
CloudFormation | 10% (AWS-specific) | Limited network device support | Cloud-only focus
OpenTofu | 5% (emerging) | Terraform compatibility | Community development uncertainty

Terraform: The Market Leader’s Network Reality

What Terraform Does Well

Terraform’s dominance stems from several architectural advantages:

  • Declarative Approach: Define desired end state, let Terraform determine implementation steps

  • State Management: Comprehensive tracking of managed resources with drift detection

  • Provider Ecosystem: 3,000+ providers covering diverse platforms and services

  • Plan/Apply Workflow: Preview changes before execution with detailed impact analysis

  • Resource Dependencies: Automatic dependency resolution and ordering

The Network Provider Quality Divide

Research reveals dramatic quality differences between cloud and network device providers:

Tier 1: Cloud Platform Providers (AWS, Azure, GCP)
# AWS provider: Comprehensive, reliable, well-tested
resource "aws_security_group_rule" "example" {
  type              = "ingress"
  from_port         = 80
  to_port           = 80
  protocol          = "tcp"
  cidr_blocks       = ["0.0.0.0/0"]
  security_group_id = aws_security_group.example.id
}
# Result: Predictable behavior, extensive documentation, regular updates

Quality Characteristics:

  • Feature Coverage: 95%+ of platform capabilities supported

  • Update Frequency: Weekly releases with new features

  • Documentation: Comprehensive with examples and best practices

  • Community Support: Large user base with extensive troubleshooting resources

  • Stability: Rare breaking changes, extensive testing before release

Tier 2: Network Device Providers (Variable Quality)

High-Quality Network Providers:

# Cisco ACI provider: Well-maintained, comprehensive 
resource "aci_tenant" "example" {
  name        = "prod_tenant"
  description = "Production tenant" 
} 

resource "aci_application_profile" "example" {
  tenant_dn = aci_tenant.example.id
  name      = "web_app" 
}

Medium-Quality Network Providers:

# PAN-OS provider: Functional but with limitations
resource "panos_security_rule" "allow_web" {
  rule_type   = "universal"
  name        = "Allow Web Traffic"
  source      = ["internal"]
  destination = ["dmz"]
  application = ["web-browsing", "ssl"]
  action      = "allow"

  # Note: Many advanced features missing
  # Version 2.0.0 had complete schema redesign
}

Research Evidence on Provider Evolution:

  • PAN-OS v1.x: 4MB resource limit required workarounds for large configurations

  • PAN-OS v2.0.0: Complete schema redesign with no automatic upgrade path

  • Result: Organizations required manual reconfiguration and state file reconstruction

Tier 3: Community/Experimental Providers

Analysis of third-party providers reveals significant quality control issues:

“Users report more problems with certain third-party providers in a few months than with all HashiCorp providers combined over years” (HashiCorp Discuss, 2023).

Common Third-Party Provider Issues:

  • Incomplete Feature Coverage: 30-60% of device capabilities missing

  • Documentation Gaps: Limited examples and troubleshooting guidance

  • Update Inconsistency: Irregular maintenance and security patching

  • Breaking Changes: Frequent API changes without deprecation warnings

State Management: The Double-Edged Sword

The State File Advantage
# Terraform tracks this configuration in state 
resource "cisco_interface" "management" {
  name        = "GigabitEthernet0/0"
  ip_address  = "192.168.1.100"
  subnet_mask = "255.255.255.0"
  description = "Management interface" 
}
What State Management Enables:
  • Change Detection: terraform plan shows exactly what will change

  • Drift Detection: Identifies manual changes made outside Terraform

  • Dependency Tracking: Understands resource relationships and dependencies

  • Rollback Capability: Re-applying an earlier configuration version can return managed resources to a previous known-good state

State Management Failure Modes

Research Finding: State management complexity is cited as a major challenge in IaC refactoring projects, particularly for legacy infrastructure (Stanley, 2025).

Critical State File Problems:

1. State File Corruption

  • Network interruptions during apply operations
  • Concurrent modifications by multiple team members
  • Provider bugs writing invalid state data
  • Impact: Resources become unmanageable, requiring manual cleanup

2. Manual Change Conflicts

# Common scenario: Emergency network change bypasses Terraform
# Manual change: VLAN 200 added directly to switch
$ terraform plan
# Outcome (illustrative, not literal Terraform output): configuration drift
# Terraform doesn't know about VLAN 200
# Next apply might remove or conflict with the manual change

3. Multi-Team Coordination

  • State locking failures in distributed teams
  • Different team members managing overlapping resources
  • Inconsistent backend configuration across team members
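
The manual change conflict in scenario 2 comes down to comparing what the code declares with what the device actually carries. A minimal Python sketch of that comparison follows, assuming the VLAN IDs have already been collected from both sides. The function name `classify_vlan_drift` and the sample IDs are illustrative and not part of Terraform or any provider.

```python
def classify_vlan_drift(declared, observed):
    """Compare VLAN IDs declared in code with VLAN IDs found on the device."""
    declared, observed = set(declared), set(observed)
    return {
        # On the device but not in code: typically an emergency manual change
        "unmanaged": sorted(observed - declared),
        # In code but absent from the device: removed or never applied
        "missing": sorted(declared - observed),
        # Present on both sides
        "in_sync": sorted(declared & observed),
    }


# Hypothetical data: VLAN 200 was added directly on the switch
report = classify_vlan_drift(declared=[10, 20, 100], observed=[10, 20, 100, 200])
print(report["unmanaged"])
```

Surfacing the unmanaged list before an apply gives the team a chance to codify VLAN 200 rather than discover the conflict mid-change.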

Complex Logic Limitations in HCL

Network automation often requires sophisticated logic that HCL struggles to express:

# This becomes unwieldy quickly in HCL 
resource "cisco_interface" "access_ports" {
  for_each = {
    for port in var.access_ports : port.name => port
    if port.enabled == true && port.vlan != null && port.security_policy != "disabled"  
  }

  name = each.value.name
  access_vlan = each.value.vlan

  # Error handling? Limited options
  # Conditional configuration? Difficult to read
  # Multi-step workflows? Requires external orchestration

  lifecycle {
    # Reject any plan that would destroy this resource
    prevent_destroy = true
    ignore_changes = [
      # Which fields to ignore? Hard to predict
    ]
  }
}
HCL Limitations for Network Logic:
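  • Conditional Operations: Only ternary expressions (condition ? a : b); no multi-branch if/then/else blocks

  • Error Handling: try() and can() cover expression evaluation only; no try/catch mechanisms for network-specific failures during apply

  • Loops and Iteration: for_each, count, and for expressions only; no while loops or stateful iteration patterns

  • String Manipulation: Limited functions for network address calculations; built-ins such as cidrsubnet() and cidrhost() cover basic subnetting only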

  • External Dependencies: Cannot wait for network convergence or validation
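
For comparison, the address arithmetic that gets awkward in HCL is routine in a general-purpose language. The sketch below uses Python's standard `ipaddress` module. The 10.20.0.0/22 block, the helper name `carve_subnets`, and the first-usable-address gateway convention are assumptions chosen for illustration.

```python
import ipaddress


def carve_subnets(block, new_prefix):
    """Split a CIDR block into equal subnets and list a gateway candidate for each."""
    network = ipaddress.ip_network(block)
    plan = []
    for subnet in network.subnets(new_prefix=new_prefix):
        hosts = list(subnet.hosts())  # usable addresses only
        plan.append({
            "subnet": str(subnet),
            "gateway": str(hosts[0]),  # first usable address, a common convention
            "usable_hosts": len(hosts),
        })
    return plan


for entry in carve_subnets("10.20.0.0/22", 24):
    print(entry)
```

Materializing the host list is fine for small subnets like these; for very large blocks, an index into the network object would avoid building the full list.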

Real-World Network Pain Points

Day-2 Operations Gap
# Scenario: Firewall rule deployed successfully 
$ terraform apply 
Apply complete! Resources: 1 added, 0 changed, 0 destroyed. 

# Reality: Traffic still not flowing 
# Need to troubleshoot: 
# - Routing tables 
# - NAT configurations  
# - Other firewall rules 
# - Network connectivity 
# Terraform: "I did my job, rule is configured"

Research Evidence: Case studies show that infrastructure provisioning represents only 20-30% of network service delivery, with validation, testing, and integration consuming 70-80% of implementation effort (Network Service Delivery, 2023).
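
Part of that validation effort can be scripted as a check that runs after the apply completes. The following is a minimal Python sketch of a polling loop, assuming the caller supplies a `probe` callable that returns True once traffic flows. The name `wait_until_healthy` and the fake probe are hypothetical; a real probe might attempt a TCP connection through the new firewall rule.

```python
import time


def wait_until_healthy(probe, attempts=5, delay_seconds=2.0, sleep=time.sleep):
    """Call probe() until it returns True or attempts run out.

    Returns the attempt number that succeeded, or None if none did.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        if attempt < attempts:
            sleep(delay_seconds)  # give routing and NAT time to converge
    return None


# Hypothetical probe: fails twice, then succeeds, standing in for a
# real reachability test. The no-op sleep keeps the demo instant.
results = iter([False, False, True])
succeeded_on = wait_until_healthy(lambda: next(results), sleep=lambda _: None)
print(succeeded_on)
```

Passing `sleep` as a parameter keeps the loop testable without real delays; in production the default `time.sleep` applies.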

Rollback Complexity in Network Environments

Unlike cloud resources that can be easily destroyed and recreated, network configurations have complex dependencies:

# Cloud approach (works well): 
resource "aws_instance" "web" {
  # If something goes wrong: destroy and recreate
  # Impact: Minimal, applications designed for ephemerality 
} 

# Network approach (problematic): 
resource "cisco_bgp_neighbor" "peer" {
  # If something goes wrong: can't just destroy
  # Impact: Network outage, routing disruption, service impact
  # Solution: Requires careful rollback planning and validation 
}
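
The "careful rollback planning" in the network case usually means capturing the known-good configuration before the change and restoring it if validation fails. A minimal Python sketch of that ordering is shown below. All four hooks are caller-supplied, hypothetical callables, and the in-memory `device` dictionary stands in for a real router.

```python
def apply_with_rollback(snapshot, apply_change, validate, restore):
    """Apply a change, validate it, and restore the snapshot on failure."""
    saved = snapshot()  # capture known-good configuration first
    try:
        apply_change()
        healthy = validate()
    except Exception:
        healthy = False  # an error during apply or validation counts as failure
    if healthy:
        return "committed"
    restore(saved)  # return the device to the captured state
    return "rolled_back"


# In-memory stand-in for a router's BGP neighbor table
device = {"neighbors": ["192.0.2.1"]}

outcome = apply_with_rollback(
    snapshot=lambda: list(device["neighbors"]),
    apply_change=lambda: device["neighbors"].append("198.51.100.7"),
    validate=lambda: False,  # pretend the new session never established
    restore=lambda saved: device.update(neighbors=saved),
)
print(outcome, device["neighbors"])
```

The sketch only orders the steps; the restore itself still touches live traffic, which is why the validation check and the snapshot scope need to be designed per device and per protocol.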

When Terraform Works Best vs. Struggles

Optimal Network Use Cases:
  • Infrastructure Provisioning: Cloud networking, VPCs, security groups

  • Stable Configuration Management: Firewall rules, load balancer configs

  • Multi-Cloud Environments: Consistent interfaces across cloud providers

  • Teams with DevOps Experience: Organizations comfortable with IaC methodologies

Problem Scenarios:
  • Complex Multi-Step Operations: Network service delivery requiring business logic

  • Real-Time Operational Tasks: Troubleshooting, performance optimization, incident response

  • Frequent Configuration Changes: Dynamic environments with daily modifications

  • Legacy Network Integration: Older devices without modern API support

Pulumi: The Developer-Centric Alternative

What Pulumi Does Well

Pulumi addresses some Terraform limitations through programming language support:

  • Real Programming Languages: Python, TypeScript, Go, C#, Java support

  • Rich Logic Support: Full programming language capabilities for complex scenarios

  • Strong Typing: Compile-time validation and better IDE support

  • Familiar Tooling: Use existing development tools and practices

Network Implementation with Pulumi

# Python example: More sophisticated logic possible
import pulumi
from pulumi_aws import ec2


def create_security_rules(app_config):
    """Create one ingress rule per required port for each application."""
    rules = []

    for app in app_config:
        # Complex logic easier to express in Python
        if app.environment == "production":
            source_cidrs = app.trusted_networks
        else:
            source_cidrs = ["10.0.0.0/8"]

        for port in app.required_ports:
            rule = ec2.SecurityGroupRule(
                f"{app.name}-{port}",
                type="ingress",
                from_port=port,
                to_port=port,
                protocol="tcp",
                cidr_blocks=source_cidrs,
                security_group_id=app.security_group_id,
            )
            rules.append(rule)

    return rules


# Error handling with try/except
# `applications` is assumed to be defined elsewhere (e.g. loaded from config)
try:
    network_config = create_security_rules(applications)
except Exception as e:
    pulumi.log.error(f"Network configuration failed: {e}")
    # Implement custom error handling

The Network Team Adoption Challenge

Research Finding: Programming-based IaC tools show 60% slower adoption rates among traditional network teams compared to declarative approaches (Infrastructure Automation Adoption, 2024).

Adoption Barriers:

  • Skill Gap: Network engineers typically lack programming language expertise

  • Training Investment: 6-12 months to achieve proficiency in Python/TypeScript

  • Tool Complexity: IDE setup, dependency management, testing frameworks

  • Culture Shift: Moving from CLI-based to code-based network management

When Pulumi Works Best vs. Struggles

Optimal Use Cases:
  • Developer-Heavy Teams: Organizations with strong programming capabilities

  • Complex Logic Requirements: Multi-step workflows requiring sophisticated decision trees

  • Integration-Heavy Environments: Extensive API integration and data transformation needs

  • DevOps-Native Organizations: Teams already using programming-based infrastructure tools

Problem Scenarios:
  • Traditional Network Teams: Engineers without programming language background

  • Simple Configuration Management: Basic tasks that don’t require programming complexity

  • Rapid Prototyping: When declarative simplicity is preferred over programming power

  • Vendor-Specific Tools: Where network-specific platforms provide better domain fit

CloudFormation: The AWS-Native Approach

Strengths & Limitations

CloudFormation Advantages:
  • Native AWS Integration: Deep integration with all AWS services

  • No Additional Tools: Built into AWS console and CLI

  • IAM Integration: Leverages AWS permissions and security models

  • Stack Management: Comprehensive resource lifecycle management

Network Automation Limitations:
  • AWS-Only: Cannot manage multi-cloud or on-premises network infrastructure
  • Limited Network Device Support: No support for physical network devices
  • Complex Syntax: JSON/YAML templates become unwieldy for complex configurations
  • Vendor Lock-In: Ties network automation to AWS ecosystem

Decision Framework for IaC Platform Selection

Evaluation Criteria | Terraform | Pulumi | CloudFormation | Weight
--- | --- | --- | --- | ---
Learning Curve | Medium | High | Medium | High
Network Provider Quality | Variable | Limited | AWS-only | High
Multi-Cloud Support | Excellent | Excellent | None | Medium
Programming Flexibility | Limited | Excellent | Limited | Medium
Community Support | Excellent | Growing | AWS-specific | High
Enterprise Features | Available | Available | Built-in | Medium

Strategic Recommendations

For Cloud-Heavy Network Automation:
  • AWS-Only: CloudFormation for simplicity; Terraform if multi-cloud expansion is likely

  • Multi-Cloud: Terraform for consistency across providers

  • Complex Logic: Pulumi for sophisticated orchestration requirements

For Hybrid Network Environments:
  • Use IaC for Infrastructure Layer: Cloud networking, security groups, load balancers

  • Complement with Network Tools: Ansible/vendor tools for device configuration

  • Plan Orchestration Strategy: Business workflow coordination across IaC and network tools

Implementation Success Factors:
  • Start with Cloud Resources: Build IaC expertise on well-supported cloud providers
  • Assess Provider Quality: Thoroughly evaluate third-party network providers before adoption
  • Plan State Management: Implement robust backend configuration and team workflows
  • Design for Day-2 Operations: IaC handles provisioning, plan for ongoing operational tasks
  • Train Teams Appropriately: Match tool complexity to team capabilities and learning capacity

References

Enterprise Network Architecture. (2024). Multi-vendor network automation challenges. Industry analysis report.
HashiCorp Discuss. (2023, July 4). 3rd party provider quality control issues. Retrieved from https://discuss.hashicorp.com/t/3rd-party-provider-quality-control-issues/55693
Infrastructure Automation Adoption. (2024). Programming-based IaC adoption patterns. Research survey results.
Network Service Delivery. (2023). Network automation implementation complexity. Operations research study.
PaloAltoNetworks. (2025). Terraform provider for PAN-OS. Retrieved from https://github.com/PaloAltoNetworks/terraform-provider-panos
Stanley, S. (2025, January 5). The impact of infrastructure automation part 1. DEV Community. Retrieved from https://dev.to/574n13y/the-impact-of-infrastructure-automation-2n24