There are many different components involved in network automation, including scripting languages, automation platforms, application programming interfaces (APIs), operating systems, data modeling languages, templating languages, system admin tools, and DevOps methodologies. As noted by David Barroso, creator of NAPALM (Network Automation and Programmability Abstraction Layer with Multivendor Support – a Python library for managing network devices): “Network automation is not just a single discipline; it is a collection of protocols, tools, and processes that can be overwhelming to the uninitiated.”
Network automation has been gaining momentum and increasing adoption. Most operations engineers are doing some form of task automation, for example writing a script to automate a single activity such as a configuration change. While this type of narrow-scope automation is worthwhile, its benefit is marginal. Task automation provides modest increases in productivity at the individual engineer level but has minimal impact on the overall business because it fails to incorporate a broad enough set of activities to reduce overall effort and end-to-end process execution times.
Most network teams attempt to expand their automation strategies by shifting tools from a scripting focus to an orchestration-centric approach, which can entail a significantly larger investment in money and time. This paper outlines an alternative approach to developing a network automation strategy, one that is devised top-down, based on use cases and workflows, rather than bottom-up, based on the tools and systems needed to automate such workflows.
Network automation is about simplifying the tasks involved in configuring, managing, and operating network equipment, network topologies, network services, and network connectivity.
Jason Edelman, Network Programmability and Automation
Network automation use cases include:
- Deployment: Using tools to collect configuration variables and apply configuration to newly deployed network nodes, enabling them to communicate on the network.
- Provisioning: Using templates with specific configuration parameters to generate and apply changes in order to enable a network service.
- Version Management: Using tools to automate the distribution and implementation of new software versions, point releases, and security patches.
- Data Collection: Using tools to pull the operational run-time data from network nodes to augment data provided by existing monitoring tools.
- Configuration Compliance: Auditing a configuration for adherence to security and business policies and automatically triggering remediation if there is a compliance violation.
- Troubleshooting: Automating the analysis of fault and performance data to find root causes more quickly and reliably, resolve those issues that can be addressed via automation, and predict future issues.
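The provisioning use case above can be illustrated with a minimal sketch. It uses only Python's standard-library string.Template. The template text, the parameter names (vlan_id, vlan_name, interface), and the function render_vlan_config are hypothetical and chosen for illustration; a real deployment would more likely use a dedicated templating language and apply the result through a device or controller API.

```python
from string import Template

# Hypothetical service template; the CLI syntax is illustrative only.
VLAN_TEMPLATE = Template(
    "vlan $vlan_id\n"
    " name $vlan_name\n"
    "interface $interface\n"
    " switchport access vlan $vlan_id\n"
)


def render_vlan_config(params: dict) -> str:
    """Render the configuration for one service instance.

    Template.substitute raises KeyError if a parameter is missing, so an
    incomplete request fails before anything is sent to a device.
    """
    return VLAN_TEMPLATE.substitute(params)


if __name__ == "__main__":
    print(render_vlan_config(
        {"vlan_id": 120, "vlan_name": "guest-wifi", "interface": "Ethernet1/7"}
    ))
```

Keeping the template separate from the parameters means the same reviewed template can be reused for every instance of the service, with only the parameter set changing per request.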
Top-Down vs. Bottom-Up Approach to Automation
Top-Down Beats Bottom-Up
Top-down and bottom-up are contrasting methods for designing systems, used in the management of people, networks, and countless other fields. With a top-down approach, we first construct an overview of the system and then specify the first layer of subsystems. Each subsystem is then defined in more detail, with lower-level components specified until we reach a useful level of atomicity. Essentially, the top-down approach starts with the big picture and then deconstructs it into finer elements.
In contrast, a bottom-up approach starts with subsystems and bolts them together to form more complex systems. Nature tends to take a bottom-up approach to system design, as it has the luxury of long time scales and natural selection to form good systems. Conversely, engineering tends to take a top-down approach. A top-down approach will often lead to a better outcome than bottom-up, as it optimizes the overall solution to meet a global objective, rather than individual components, which only optimize at a local level.
Although network teams are steeped in engineering talent, when it comes to developing a network automation strategy, they often inadvertently end up taking a bottom-up approach, because they may have already made choices about subsystems such as network devices, controllers, and orchestrators before they begin to think about automation more holistically. Network teams often start their engineering projects by selecting networking equipment; then add some management and automation systems on top (often from the equipment vendor); and then take those automated systems and look for ways to stitch them together.
Deferring the more important strategic decisions about the automation strategy can often come back to haunt automation projects, leading to compromises having to be made or projects simply getting bogged down in analysis paralysis. By taking a bottom-up componentized view, you don’t understand what you are automating until you have finished. As a result, you may end up regretting decisions made earlier (at the instrumentation level) in your project because you end up with suboptimal automation.
Top-Down Approach Requires a Network Integration Automation Platform
The top-down approach requires the ability to visualize and design automations and the capability to integrate across a diverse set of network technologies, IT systems, and CI/CD tools. Generic workflow tools exist, but unlike a domain-specific tool they do not understand industry-specific concepts such as topology or languages such as YANG, YAML, and TOSCA, and they lack the ability to integrate quickly with northbound and southbound systems, which are essential sources of information.
Northbound integrations are key to providing the benefits of automation to the entire company, not just operations and engineering.
The automation platform connects northbound to:
- IT Service Management (ITSM) Systems (i.e., business process management tools that automate human processes)
- Ticketing & Change Management Systems (ServiceNow, Remedy, ManageEngine)
- Inventory & Configuration Management Databases (including IP Address Management)
- Customer Relationship Management (CRM) Systems (Salesforce, Jira)
- DevOps/SRE Platforms and Tools such as GitLab, GitHub, Jenkins, Ansible, Terraform, and Python
Southbound, the automation platform connects to:
- Cloud Platforms and Controllers (AWS, Azure, GCP, Alkira, Aviatrix, etc.)
- Data Center (Cisco, Juniper, Arista, VMware NSX, etc.)
- SD-WAN/SASE (Versa, VeloCloud, Viptela, Silver Peak, etc.)
- Campus (Dell, HP, Aruba, Juniper, Cisco, etc.)
- Security (Palo Alto, Fortinet, Zscaler, Tufin, etc.)
- Network Services (Infoblox, NetBox, BlueCat, F5, etc.)
However, domain-specific orchestrators may not have sufficient oversight to manage the automation of services that span multiple domains. For example, automating a service turn-up might require the use of separate controllers for a firewall, a router, and an SD-WAN system, all of which need to be integrated.
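To make the multi-domain turn-up example concrete, the following is a minimal sketch of a workflow that applies a change in several domains in order and undoes the completed steps if a later one fails. The function turn_up_service and the step structure are hypothetical; in practice each callable would wrap a call to the relevant firewall, router, or SD-WAN controller, and the rollback policy shown here is an assumption rather than a recommendation from this paper.

```python
from typing import Callable, List, Tuple

# Each step pairs a domain name with "apply" and "undo" callables. In practice
# these would wrap calls to the firewall, router, and SD-WAN controllers; here
# they are hypothetical stand-ins.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def turn_up_service(steps: List[Step]) -> List[str]:
    """Apply each domain step in order; undo completed steps if one fails."""
    completed: List[Step] = []
    log: List[str] = []
    for step in steps:
        name, apply, _undo = step
        try:
            apply()
        except Exception as exc:
            log.append(f"{name}: failed ({exc})")
            # Roll back in reverse order so no domain is left half-configured.
            for done_name, _apply, undo in reversed(completed):
                undo()
                log.append(f"{done_name}: rolled back")
            return log
        completed.append(step)
        log.append(f"{name}: applied")
    return log


if __name__ == "__main__":
    def unreachable() -> None:
        raise RuntimeError("controller unreachable")

    def noop() -> None:
        pass

    for line in turn_up_service([
        ("firewall", noop, noop),
        ("router", noop, noop),
        ("sd-wan", unreachable, noop),
    ]):
        print(line)
```

The point of the sketch is that the coordination logic lives above the individual controllers, which is the oversight a single-domain orchestrator does not have.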
Even multi-domain orchestrators may fall short in their ability to automate processes end-to-end. As network automation evolves, it should not be tied to a specific application, controller or orchestrator, but should become a platform that is deployed horizontally across the network, end-to-end. As this platform accesses multiple data sources, it can become a powerful agent of change, breaking traditional boundaries inside operators between networking and IT, and fostering collaboration. Initial efforts may target simple use cases such as TCP optimization, backhaul management or customer care. Once the operator is confident of the new approach, the platform can be expanded to more robust use cases that leverage the same data sources.
Automation Is More Than Orchestration
For a long time, network teams have used scripting for task and device-level management (identified as Stage 2 in the diagram below). Some teams have progressed from device-level to service automation with tools designed to manage state-dependent use cases. However, the evolution from Stage 2 to Stage 3 (Process Orchestration) requires more than configuration state; it requires the ability to integrate with all of the systems and users in the process. In some cases, such end-to-end automation might be overkill, and device-level management with scripting will suffice. There is no one-size-fits-all answer. Network teams must look for the right technology to address their particular environment and challenges, using the right tool for the right job.
Often organizations will start an automation project by implementing a comprehensive network orchestration platform. Sometimes, once they have automated and instrumented the various workflows, they realize that they could have achieved 80 percent of the functionality with an open source tool such as Ansible for a lot less money. Only 20 percent of the use cases required the functionality of the networking-specific orchestration platform.
Orchestration facilitates managing network services but does not include all the operational components that contribute to operational cost, such as finding ports, cards and racks, or coordinating a change control to upgrade a branch. Equally, ITSM tools can push change requests to the network, but there are many other processes that need to be executed. An intelligent network automation platform automates activities outside the scope of orchestration and ITSM, such as cross-domain (network/IT/DevOps/cloud) coordination activities through API federation. A lot of the cost associated with automation projects relates to integration with your systems. With a federated layer of APIs, this integration can be done once and reused multiple times.
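The idea of integrating once and reusing many times can be sketched as a small registry that sits in front of the individual system adapters. Everything here is hypothetical: the class FederatedApi, the operation names, and the stand-in adapters are illustrative assumptions, not a description of any particular product's API.

```python
from typing import Any, Callable, Dict


class FederatedApi:
    """Hypothetical single entry point in front of many system adapters."""

    def __init__(self) -> None:
        self._operations: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, handler: Callable[..., Any]) -> None:
        # The integration work for a system is done once, here.
        if name in self._operations:
            raise ValueError(f"operation already registered: {name}")
        self._operations[name] = handler

    def call(self, name: str, **kwargs: Any) -> Any:
        # Every workflow reuses the same registered integration.
        try:
            handler = self._operations[name]
        except KeyError:
            raise LookupError(f"no integration registered for: {name}") from None
        return handler(**kwargs)


# Stand-in adapters; real ones would call the ticketing and inventory APIs.
def open_change_ticket(summary: str) -> dict:
    return {"ticket": "CHG-0001", "summary": summary}


def reserve_port(site: str) -> dict:
    return {"site": site, "port": "Ethernet1/1"}


api = FederatedApi()
api.register("itsm.open_change", open_change_ticket)
api.register("inventory.reserve_port", reserve_port)


def branch_upgrade_workflow(site: str) -> dict:
    ticket = api.call("itsm.open_change", summary=f"Upgrade branch {site}")
    return {"ticket": ticket["ticket"], "site": site}


def new_circuit_workflow(site: str) -> dict:
    ticket = api.call("itsm.open_change", summary=f"New circuit at {site}")
    port = api.call("inventory.reserve_port", site=site)
    return {"ticket": ticket["ticket"], "port": port["port"]}
```

Both workflows use the same ticketing integration, so replacing the ticketing system would mean changing one adapter rather than every workflow.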
In practice, organizations often end up with a multiplicity of orchestrators and controllers in their networks driven by distinct use cases, e.g., IP networking, SD-WAN, multi-cloud networking, and data center networking. If these tools are from different vendors, this can lead to a significant integration tax for the IT department. By using an automation platform that sits above the orchestration layer, organizations can avoid this manual integration effort.
Workflow Documentation Is Key
Network teams should start their automation projects by evaluating target use cases and processes, forecasting the business impact of automating those processes, then selecting the tools that are required to achieve these goals, rather than the other way around.
- If you need to maintain a service lifecycle, then yes, you need a robust tool such as a controller.
- If you are doing something simple, such as software provisioning, configuration management or application deployment, then use Ansible or off-the-shelf packaged software automation solutions.
- If you have previously written code, then a Python script may suffice.
- If you need to coordinate activities across multiple technology domains, or to utilize information from the network or IT systems within the process, then an integrated automation platform is the best choice.
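The guidance in the list above can be written down as a simple decision function, which is one way to make tool selection follow from documented use cases. This is a sketch only: the UseCase fields and the function suggest_tool are hypothetical, and the order in which the rules are checked (broadest requirement first) is an assumption, since the list itself does not rank them.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    needs_service_lifecycle: bool = False
    spans_multiple_domains: bool = False
    uses_it_system_data: bool = False
    team_writes_code: bool = False


def suggest_tool(use_case: UseCase) -> str:
    """Map a documented use case to a tool category, broadest need first."""
    if use_case.spans_multiple_domains or use_case.uses_it_system_data:
        return "integrated automation platform"
    if use_case.needs_service_lifecycle:
        return "controller"
    if use_case.team_writes_code:
        return "Python script"
    return "Ansible or packaged automation software"


if __name__ == "__main__":
    print(suggest_tool(UseCase(needs_service_lifecycle=True)))
    print(suggest_tool(UseCase(spans_multiple_domains=True)))
```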
User stories should be the guide as to which tools are selected downstream of the workflow engine. Organizations must understand the existing, often manual, workflows before attempting to automate. As such, engineers must document these workflows properly, identify interdependencies so that changes do not have inadvertent consequences, and redesign the workflows to take advantage of programmability such that the new automated process is better than the traditional, manual process.
Automation projects often start by looking at the existing way of doing things manually and then trying to automate each step. That avoids the cultural challenge of changing procedures and allows the engineers to maintain control. The next step is to look for new ways to add value or functionality that wasn’t possible with manual processes, and to condense or eliminate excessive steps.
The automation design should start with high-level business “intent” and then work its way down to more granular specifications. Usually that is how automation projects start out, with the executive sponsor (budget holder) asking to reduce network operations costs or enable a new business model of dynamic enterprise services (on-demand, self-serve, pay as you go, etc.). Often, however, engineering then starts building the solution from the bottom up (or perhaps the middle), based on an orchestration platform they have chosen or are evaluating. Instead, they should continue the top-down process step by step, getting closer to the network gradually.
Evolving Your Automation Journey
Automation is not a single step; it is a journey. But unless network engineers document their workflows, how can they evaluate their degree of automation? A network engineer might have automated all of their command-line interface (CLI) inputs, only to realize that these represent just 10 percent of a multi-step configuration change process. For example, a Layer 2 VPN provisioning process could be automated with an orchestrator, but engineers may still need to manually open a firewall port and update an inventory system. Once we have a holistic understanding of our end-to-end processes, we must decide what degree of automation is appropriate. For example, the last 25 percent to reach closed-loop or zero-touch automation may not be cost-effective or reliable.
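Once a workflow is documented step by step, the degree of automation can be estimated with very little code. The sketch below weights each step by the manual effort it represents. The WorkflowStep structure, the function automation_coverage, and all of the step names and durations are illustrative assumptions, not measurements.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WorkflowStep:
    name: str
    minutes: float      # typical effort when the step is done by hand
    automated: bool


def automation_coverage(steps: List[WorkflowStep]) -> float:
    """Return the share of total effort (0.0 to 1.0) that is automated."""
    total = sum(step.minutes for step in steps)
    if total == 0:
        return 0.0
    automated = sum(step.minutes for step in steps if step.automated)
    return automated / total


# Illustrative figures only; they are not measurements from this paper.
l2vpn_change = [
    WorkflowStep("Raise and approve change request", 45, False),
    WorkflowStep("Reserve ports in inventory", 20, False),
    WorkflowStep("Push CLI configuration", 10, True),
    WorkflowStep("Open firewall port", 15, False),
    WorkflowStep("Update inventory and close ticket", 10, False),
]

if __name__ == "__main__":
    print(f"Automated share of effort: {automation_coverage(l2vpn_change):.0%}")
```

With these example figures, the fully automated CLI step accounts for 10 of 100 minutes, which mirrors the situation described above where automating the CLI covers only a small part of the end-to-end process.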
Service provider and large enterprise networks are complex. With so many variables, zero-touch automation presents a significant challenge. As Murphy’s law states: “Anything that can go wrong will go wrong.” Humans are on hand to deal with unexpected exceptions and keep the networks running while trying to minimize the impact on customer experience.
Automating manual back-office tasks within IT operations with a technology such as robotic process automation (RPA) is straightforward. And automating low-level functionality in the network is simple with scripts. But when you try to automate the management of network services over time, you get complex feedback loops with many interactions, which makes simple, rules-based systems hard to implement. Today we still require a human who can oversee the lifecycle of network services and guide the system down the right path when it gets stuck or confused. Perhaps we can reliably automate 80 percent of the service lifecycle, but in 20 percent of cases human intervention is required.
With the introduction of machine learning, every time the human intervenes to guide or override the automation system, this should be fed back into the algorithm to improve it. Learning is a continuous process, both to optimize existing workflows, and to keep up with changes in the network. The number of process variations is so large that it will likely take many years before we converge on an automation level of 99 percent. Zero-touch automation is perhaps an asymptotic goal.
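The feedback loop described above can be sketched in a few lines. Note that this example deliberately uses a simple frequency count as a stand-in for a learning algorithm: when a human gives the same answer for the same condition often enough, that answer is promoted to an automatic rule. The class AssistedResolver, the promote_after threshold, and the condition and action strings are all hypothetical.

```python
from collections import Counter
from typing import Callable, Dict


class AssistedResolver:
    """Resolve known conditions automatically; escalate the rest to a human."""

    def __init__(self, promote_after: int = 3) -> None:
        self.rules: Dict[str, str] = {}
        self.history: Counter = Counter()
        self.promote_after = promote_after

    def resolve(self, condition: str, ask_human: Callable[[str], str]) -> str:
        if condition in self.rules:
            # Already learned: no human involvement needed.
            return self.rules[condition]
        action = ask_human(condition)
        # Feed the human decision back into the system.
        self.history[(condition, action)] += 1
        if self.history[(condition, action)] >= self.promote_after:
            self.rules[condition] = action
        return action


if __name__ == "__main__":
    resolver = AssistedResolver(promote_after=2)
    operator = lambda condition: "retry with backup path"
    for _ in range(3):
        print(resolver.resolve("primary path down", operator))
    print(resolver.rules)
```

A production system would need more than a counter, for example review of promoted rules and a way to retire them when the network changes, but the structure of escalate, record, and reuse is the same.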
Benefits of Automation
The benefits of network automation are clear: accelerating operations, reducing cycle times, improving productivity, and reducing errors. Automation reduces the time needed for deployments and configuration changes, so that communications service providers and enterprises can be more responsive to customers and deliver services faster. Automation is also about reducing errors that impact customer experience.
Around a quarter of customer-care issues end up being directed to engineering and operations staff (the bulk of enquiries are billing- and planning-related). These technical teams spend a lot of time trying to resolve issues, using multiple tools to find the root cause and work out how to fix it. Such manual troubleshooting is time-consuming, which is frustrating for the customer. By automating many of the data-gathering elements of such troubleshooting procedures, service providers and enterprises can shorten the time to resolution and achieve greater consistency.
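As a rough illustration of automated data gathering, the sketch below runs several collectors in parallel using Python's standard-library thread pool and returns the results together. The function gather_diagnostics and the collector names are hypothetical; each collector would in practice query a monitoring tool, a device, or an inventory system.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def gather_diagnostics(collectors: Dict[str, Callable[[], str]]) -> Dict[str, str]:
    """Run every collector in parallel and return results keyed by name.

    A collector that raises is recorded as an error rather than aborting the
    whole run, so the engineer still gets the remaining data.
    """
    results: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(collectors))) as pool:
        futures = {name: pool.submit(fn) for name, fn in collectors.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result(timeout=30)
            except Exception as exc:
                results[name] = f"ERROR: {exc}"
    return results


if __name__ == "__main__":
    # Stand-in collectors that return canned text.
    report = gather_diagnostics({
        "interface_counters": lambda: "Ethernet1/7: 0 input errors",
        "recent_changes": lambda: "no changes in the last 24 hours",
    })
    for name, value in report.items():
        print(f"{name}: {value}")
```

Because every case runs the same collectors in the same way, the engineer starts from a consistent data set rather than from whichever tools they happened to check first.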
Different engineers working on the same network will often implement changes via CLI in slightly different ways, and if they are copying and pasting changes, it is easy to introduce an error. By automating, we can make configuration changes more consistent and reduce the scope for human errors. Fewer errors translate to lower operating cost, but automation can also lead to lower capital expense, as it can increase the reliability of the network, so network planners don’t have to rely on massively overprovisioning to compensate for uncertainty.
Networking teams have been on a journey from manual operations to automation for more than a century. What’s changed recently is the increasingly dynamic nature of the network that comes with the move to virtualization and cloud adoption. However, as technology domains shift to support agile-based methodologies, DevOps processes, and programmability, it is imperative that the network undergoes this same transformative shift. It is crucial to embrace this new mindset focused on supporting digital transformation while providing the most flexible network infrastructure possible. New thought processes must be implemented that consider intent and focus on innovation and re-usability. Network operators must push their vendors to embrace this vision and provide network components and management tools with modern, standards-based APIs.
The key to developing a good automation strategy is to take a top-down approach. Invest time in documenting existing processes with an intelligent tool, and then let those automation stories guide which technology components (controllers, orchestrators, etc.) to select, not vice versa.
Focus on Use Cases Rather Than Technology Components
- Which groups (engineering/operations) will be impacted?
- What systems are involved?
- What is the service or goal?
- How many integrations are required?
Start with your business intent (deliver a service, be more secure/compliant, reduce maintenance spending, etc.) and then build the automations and technology components to support it. Do not obsess over automating one particular part of the process (e.g., CLI input) while forgetting to automate other, more onerous manual tasks that hold back agility and time to market.
Automation is an iterative and incremental process, not a monolithic solution. You will only learn the lessons of automation in your network environment through doing. Start with manageable automation projects, rather than trying to boil the ocean. Create an automation “on-ramp” with measurable benefits that leverages an integrated automation ecosystem and DevOps practices and tools.
Treat Your Infrastructure as Code and Make a Plan to Automate It
- Where do we store configuration/metric data?
- Where do we store code/models/scripts?
- How do we manage code/configuration lifecycle?
- How do we test?
- How fast can we deploy updates?
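One way to approach the testing question above is to keep the intended configuration under version control and compare it with what a device is actually running. The sketch below uses Python's standard-library difflib for the comparison. The function config_drift and the sample configuration lines are hypothetical; how the running configuration is retrieved is outside the scope of this example.

```python
import difflib
from typing import List


def config_drift(intended: str, running: str) -> List[str]:
    """Return unified-diff lines between intended and running configuration.

    An empty list means the running configuration matches the intended one.
    """
    return list(
        difflib.unified_diff(
            intended.splitlines(),
            running.splitlines(),
            fromfile="intended",
            tofile="running",
            lineterm="",
        )
    )


if __name__ == "__main__":
    intended = "hostname r1\nntp server 10.0.0.1\n"
    running = "hostname r1\nntp server 10.0.0.9\n"
    for line in config_drift(intended, running):
        print(line)
```

A check like this can run in a pipeline after every change, so drift is reported as a failed test rather than discovered during an outage.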
The days of network teams being the bottleneck to IT change are now over. Organizations that realize the advantages of software-driven network automation technologies with closed-loop automation capabilities have a tremendous market advantage during this age of digital transformation and IoT expansion.
Achieve End-to-End Network Automation with Itential
Itential is purpose-built for today’s complex and distributed networks. Our powerful network automation software is used within some of the largest networks in the world to automate complex, business-critical changes at breakneck speeds, from Fortune 500 telecommunications and financial service companies to enterprises of all sizes. Our world-class products accelerate the move toward software-driven networks and next generation, agile network operations.
The Itential Automation Platform is an easy-to-use, scalable, and feature-rich network automation solution for hybrid network infrastructure. Itential is a vendor-agnostic solution that seamlessly connects disparate systems such as IT Service Management, inventory, analytics and orchestration tools for end-to-end and closed-loop network automation capabilities.
Automate Any Network
Today’s networks require automation capabilities that span multiple domains, from traditional physical networks to next-generation programmable networks, SD-WANs, cloud networks and more. With Itential, seamlessly automate across any network domain and any network vendor.
Automate Any Network Change
Automate any network change from routine operations tasks such as software upgrades to managing device configuration and compliance to service lifecycle and policy management. Itential’s purpose-built platform simplifies and automates network management, reducing the scope of human errors.
Connect to Anything with Itential’s Aggregated Network API
Itential’s patented integration capabilities simplify your network automation ecosystem by providing a single, aggregated network API that connects all of your IT systems with your orchestrators and controllers, configuration tools and custom-built scripts to enable true end-to-end network automation.
Deliver Self-Service Network Automation
Extend the scope of your end-to-end automations by integrating northbound with your end users, pipelines, or applications. With Itential, any workflow you build can be exposed for consumption.