Managing Network Interface States

When I first started doing network automation during my trip down consulting lane a few years ago, the idea of configuring interfaces was…contentious. Depending on the types of devices in my inventories, and the spread of potential interface sources (or lack thereof), I was genuinely anxious at the thought of interface discussions.

In my experience, you can have a nearly limitless list of configuration commands that you need to add, remove, or verify. The source for those commands and configs is often whatever happens to be running on a production device, and the command lists are slightly different for every vendor and device. And, obviously, there are hundreds or thousands of interface templates.

And, of course, Regex. Dread it, run from it, regex is still the quickest way to parse text.

If this fresh hell sounds familiar, then you’ll be pleased to know that there’s a better way: state management. No more rigorous output inspection, no more wondering what commands to run and in which order to run them…

Ansible Network Resource Modules are the solution to managing device state across different devices and even different device types.

Resource Modules already have the logic built in to know how config properties need to be orchestrated in which specific ways, and these modules know how to run the behind-the-scenes commands that get you the desired configuration state.
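To make the idea of state management a little more concrete, here’s a toy Python sketch of "compare the current state to the desired state, then work out the commands." This is only an illustration and is not how the resource modules are actually implemented; the function name, the dictionary keys, and the sample values are all hypothetical.

```python
# Toy illustration of state-based config: compare what the device has
# against what we want, and emit only the commands needed to close the gap.
# NOT Ansible's implementation; names and keys here are hypothetical.

def commands_for_interface(have, want):
    """Return the CLI lines needed to move one interface from have to want."""
    commands = []
    if want.get("description") != have.get("description"):
        if want.get("description"):
            commands.append(f"description {want['description']}")
        else:
            commands.append("no description")
    if want.get("enabled", True) != have.get("enabled", True):
        commands.append("no shutdown" if want.get("enabled", True) else "shutdown")
    # Only enter the interface context when something actually changes.
    if commands:
        commands.insert(0, f"interface {want['name']}")
    return commands


have = {"name": "Ethernet1/1", "description": "old uplink", "enabled": False}
want = {"name": "Ethernet1/1", "description": "ansible_managed-Te0/1/2", "enabled": True}

print(commands_for_interface(have, want))
# ['interface Ethernet1/1', 'description ansible_managed-Te0/1/2', 'no shutdown']
print(commands_for_interface(want, want))
# [] (already in the desired state, so nothing to send)
```

The second call is the important one: when the device already matches the desired state, there is nothing to run, and that is what keeps repeated runs safe.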

For a deep dive into Resource Modules, my friend Trishna did a wonderful talk at AnsibleFest 2019.

As a practical example, here’s a short snippet of an interface variable template:

interface_config:
- interface: Ethernet1/1
  description: ansible_managed-Te0/1/2
  enabled: True
  mode: trunk
  portchannel_id: 100

- interface: Ethernet1/2
  enabled: False

...

- interface: port-channel100
  description: vPC PeerLink
  mode: trunk
  enabled: True
  vpc_peerlink: True
  members:
    - member: Ethernet1/1
      mode: active
    - member: Ethernet1/36
      mode: active

Using the new network resource modules, we simply define our interface properties, and Ansible will figure out the rest:

- name: Configure Interface Settings
  nxos_interfaces:
    config:
      - name: "{{ item['interface'] }}"
        description: "{{ item['description'] }}"
        enabled: "{{ item['enabled'] }}"
        mode: "{% if 'ip_address' in item %}layer3{% else %}layer2{% endif %}"
    state: replaced
  loop: "{{ interface_config }}"
  when: (interface_config is defined and (item['enabled'] == True))

In the example above, the task loops through each interface in the interface config template and checks whether it is enabled. If so, the module sets those config values on that interface. You’ll do the same sort of thing for your VLANs/trunks, vPCs, port channels, etc…

- name: Configure Port Channels
  nxos_lag_interfaces:
    config:
      - name: "{{ item['interface'] }}"
        members: "{{ item['members'] }}"
    state: replaced
  loop: "{{ interface_config }}"
  when: ('port-channel' in item['interface'] and ('members' in item))

And if the nxos_interfaces configs look familiar, that’s because they are! They’re the same thing as what you would get from nxos_facts parsing the interfaces section:

- name: gather nxos facts
  nxos_facts:
    gather_subset: interfaces
    # the ansible_network_resources output comes from this option
    gather_network_resources: interfaces

If you do it right, you can now take interface facts and pass them right back into Ansible as configuration properties!

ansible_facts:
  ansible_net_fqdn: rtr2
  ansible_net_gather_subset:
  - interfaces
  ansible_net_hostname: rtr2
  ansible_net_serialnum: D01E1309…
  ansible_net_system: nxos
  ansible_net_model: 93180yc-ex
  ansible_net_version: 14.22.0F
  ansible_network_resources:
    interfaces:
    - name: Ethernet1/1
      enabled: true
      mode: trunk
    - name: Ethernet1/2
      enabled: false 

Looks awfully similar to what we started with up top, eh? Config to code, and vice versa!
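As a rough sketch of that round trip, here’s some Python that reshapes gathered interface facts into the interface_config format from the top of this post. The helper name is made up for illustration, and it assumes the facts have the same shape as the sample output above; in a real playbook you would do this with variables and filters rather than a script.

```python
# Hypothetical helper: turn gathered interface facts back into the
# interface_config variable format used at the top of this post.

def facts_to_interface_config(ansible_facts):
    resources = ansible_facts.get("ansible_network_resources", {})
    config = []
    for intf in resources.get("interfaces", []):
        # The facts call the key "name"; our template calls it "interface".
        entry = {"interface": intf["name"]}
        entry.update({key: value for key, value in intf.items() if key != "name"})
        config.append(entry)
    return config


facts = {
    "ansible_network_resources": {
        "interfaces": [
            {"name": "Ethernet1/1", "enabled": True, "mode": "trunk"},
            {"name": "Ethernet1/2", "enabled": False},
        ]
    }
}

interface_config = facts_to_interface_config(facts)
print(interface_config)
```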

Automating Networks with Ansible – Part 3

In part 1, we covered why to use Ansible. In part 2, we covered how to start using Ansible. So far, we’ve installed Ansible, set up a network inventory, and run a playbook that gathers info and “facts” from our network inventory.

But what practical things can we actually do with all of this info we have now?

Good news! We can use our fact collection role as the foundation for everything we do and build next. Everything that Ansible can be configured or orchestrated to do will always involve variables, and we’ve just given ourselves a literal dictionary’s worth of automation logic to use!

Ansible Fact Gathering

Let’s take a step back for a moment and talk about what this fact gathering thing is all about.

The fact role I use does two things. First, I use Ansible’s native configuration parsers, Network Resource Modules (more on that later), to parse the raw device config. Second, I use custom facts that I set from running ad-hoc commands.

In a mixed version/device environment where fact modules can’t run against all devices, or if you just need to expand your playbook functionality, you can parse the running config to set custom facts.

As an example, the tasks below use the ios_command module to send commands and register the CLI output, then find a specific string in that output and set it as a custom fact. Command modules are used to send arbitrary commands and return info (e.g., show run, description); they cannot make changes to the running config.

---
# Send read-only show commands and register the CLI output.
- name: collect output from ios device
  ios_command:
    commands:
      - show version
      - show interfaces
      - show running-config
      - "show ip interface brief | include {{ ansible_host }}"
  register: output

# Block scalars (>-) keep YAML from interpreting the regex backslashes.
- name: set version fact
  set_fact:
    cacheable: true
    version: >-
      {{ output.stdout[0] | regex_search('Version (\\S+)', '\\1') | first }}

- name: set hostname fact
  set_fact:
    cacheable: true
    hostname: >-
      {{ output.stdout[2] | regex_search('\nhostname (.+)', '\\1') | first }}

- name: set management interface name fact
  set_fact:
    cacheable: true
    mgmt_interface_name: "{{ output.stdout[3].split()[0] }}"

# Index 2 is the output of show running-config.
- name: set config_lines fact
  set_fact:
    config_lines: "{{ output.stdout_lines[2] }}"

This playbook will run four commands against an IOS host:

1. show version
2. show interfaces
3. show running-config
4. show ip interface brief | include {{ ansible_host }}

Ansible will then search, parse, split, or otherwise strip out the interesting information, to give you the following facts:

1. version: "14.22.0F"
2. hostname: "hostname"
3. mgmt_interface_name: "int 1/1"
4. config_lines: "full running config ..."
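If the search/parse/split step feels like magic, here’s roughly the same logic in plain Python. The sample CLI output is made up and heavily shortened, so treat it as an assumption about what a device might return. One thing it shows: \S+ also grabs the comma that follows the version string in this sample, so the sketch strips it.

```python
import re

# Made-up, shortened sample output (an assumption, not real device output).
show_version = "Cisco IOS Software, Version 15.2(4)M3, RELEASE SOFTWARE (fc2)"
show_run = "!\nversion 15.2\nhostname rtr1\n!\ninterface GigabitEthernet0/0\n"
show_ip_int_brief = "GigabitEthernet0/0  192.0.2.10  YES NVRAM  up  up"

# Same patterns as the set_fact tasks: capture group 1 is the value we keep.
version = re.search(r"Version (\S+)", show_version).group(1).rstrip(",")
hostname = re.search(r"\nhostname (.+)", show_run).group(1)

# First whitespace-separated column of the matching "brief" line.
mgmt_interface_name = show_ip_int_brief.split()[0]

# The running config as a list of lines.
config_lines = show_run.splitlines()

print(version)              # 15.2(4)M3
print(hostname)             # rtr1
print(mgmt_interface_name)  # GigabitEthernet0/0
```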

Using Facts as Logic and Conditionals

Let’s take a look at a real world example. Jinja templates are the bread and butter of Ansible configuration, and we can use device variables to determine which devices get which configs. Everything we picked up during our fact collection run is fair game.

For example, our fact collection playbook gathered this fact:

ansible_net_version: 14.22.0F

We can use that fact in a playbook, to determine whether to place a specific configuration based on the firmware version.

Here’s an example of a Cisco AAA config template that uses the OS/firmware version as the primary way to determine which commands to send:

{% if ansible_net_version.split('.')[0]|int < 15 %}
  aaa authentication login default group tacacs+ line enable
  aaa authentication login securid group tacacs+ line enable
  aaa authentication enable default group tacacs+ enable
  ...
{% endif %}

{% if ansible_net_version.split('.')[0]|int >= 15 %}
  {% if site == "pacific" %}
    aaa authentication login default group pst line enable
    ...
  {% elif site == "mountain" %}
    aaa authentication login default group mst line enable
    ...
  {% endif %}
{% endif %}

In the template above, our first if statement splits the firmware version (ansible_net_version) variable at the decimal points, converts the first group of digits ([0]) to an integer, and checks whether that number is less than 15. Version 14 matches the first config stanza, so that group of configuration lines is applied to the device.

However, if our firmware version is 15 or above, then Ansible will apply the second config stanza instead, picking the commands that match the device’s site. This handles the configuration and command syntaxes that differ between newer and older devices.
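Here’s the same branching written out as a small Python function, in case it’s easier to follow outside of Jinja. The function name is made up, and it only returns the config lines that appear in the template above (the elided lines stay elided). One difference worth knowing: Jinja’s int filter falls back to 0 when the value isn’t a number, while Python’s int() raises an error.

```python
# Hypothetical helper mirroring the template's version and site checks.

def aaa_lines(ansible_net_version, site=None):
    # "14.22.0F" -> ["14", "22", "0F"] -> 14
    major = int(ansible_net_version.split(".")[0])
    if major < 15:
        return [
            "aaa authentication login default group tacacs+ line enable",
            "aaa authentication login securid group tacacs+ line enable",
            "aaa authentication enable default group tacacs+ enable",
        ]
    if site == "pacific":
        return ["aaa authentication login default group pst line enable"]
    if site == "mountain":
        return ["aaa authentication login default group mst line enable"]
    # Version 15+ with an unknown site: the template renders nothing.
    return []


print(aaa_lines("14.22.0F")[0])
# aaa authentication login default group tacacs+ line enable
print(aaa_lines("15.2", site="mountain"))
# ['aaa authentication login default group mst line enable']
```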

Depending on the complexity of your particular network, logic and conditional checks like this will become invaluable. And if this all makes sense up to this point, then congratulations, you’re well on your way to automating your network!

Automating Networks with Ansible – Part 2

Getting Started with Ansible

Ansible doesn’t have a steep learning curve and it doesn’t require any sort of programming background to use. You can begin running commands against your network inventory in no time at all. And I can prove it!

This is all using network devices as examples, but it’s all general Ansible stuff that we’ll be doing. This next section will overview how to start using Ansible. Download and install it, make an inventory, and then run a playbook against your network — in less than five minutes!

Step One: Installing Ansible and Git

Along with Ansible, we’ll be using Git. Git is a version control system. We will use it as a code repository for storing and controlling access to our network automation playbooks.

Fedora
  dnf install ansible git

CentOS/RHEL
  yum install ansible git

Mac/PIP
  pip install ansible

Ubuntu
  apt update
  apt install software-properties-common
  apt-add-repository --yes --update ppa:ansible/ansible
  apt install ansible
  apt install git

After installation, verify that you can run Ansible:
ansible --version

Full download/install instructions can be found here:
https://docs.ansible.com/ansible/latest/installation_guide/intro_installation.html

Step Two: Create an Inventory

Now that we have Ansible installed, let’s create the inventory that Ansible will use to connect to our hosts. To keep it simple, let’s just start with a small INI file, and a few test devices with the OS they’re running and the user/pass we’ll need to log in.

In the host file you create, you’ll have one inventory host per line, followed by the variables Ansible needs to connect to it.

1. hostname_fqdn = the device’s hostname or FQDN (the first entry on the line)
2. ansible_network_os = ios/nxos
3. ansible_user = username
4. ansible_password = password

Name this file inventory.

[all]
hostname_fqdn  ansible_network_os=ios  ansible_user=<username>  ansible_password=<password>
hostname_fqdn  ansible_network_os=nxos  ansible_user=<username>  ansible_password=<password>

We’ll make a better inventory later. For now, this is as simple as it gets, and this will allow us to immediately begin connecting to and managing our network devices. With Ansible installed, and with our inventory set up with the username, password, and host OS, we’re ready to run something!

The full list of network OSes can be found here: https://github.com/ansible/ansible/blob/devel/docs/docsite/rst/network/user_guide/platform_index.rst
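If you’re curious how a one-host-per-line inventory turns into per-host variables, here’s a minimal Python sketch of the idea. It’s only a simplification, not Ansible’s actual inventory parser; the function name, hostname, and values are placeholders made up for the example.

```python
import shlex

# Simplified illustration: split "host key=value key=value" into a
# hostname plus a dictionary of variables for that host.

def parse_inventory_line(line):
    host, *pairs = shlex.split(line)
    return host, dict(pair.split("=", 1) for pair in pairs)


host, host_vars = parse_inventory_line(
    "rtr1.example.com  ansible_network_os=ios  ansible_password=example"
)

print(host)       # rtr1.example.com
print(host_vars)  # {'ansible_network_os': 'ios', 'ansible_password': 'example'}
```

Those per-host variables are what playbooks and templates read later on, which is why a single line per device is enough to get started.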

Verify: Ansible Installed; Inventory Created; Repo Ready

At this point, you should be able to run Ansible, and you should have an inventory file. Verify that you have both:

ansible --version
file inventory

Now, we need something to run! Since our goal is to begin managing our network devices, the perfect place to start is fact collection.

In Ansible, facts are useful variables about remote hosts that can be used in playbooks. And variables are how you deal with differences between systems. Facts are information derived from speaking with remote devices/systems.

An example of this might be the IP address of the remote device, or perhaps an interface status or the device model number. Regardless, this means that we can run any command, save that output as a fact, and do something with it…

For instance, we can run a command like show version, and use the output to identify the firmware version. Beyond that, the possibilities are limitless! We can use any device information we can get our hands on.

Step Three: Run a Playbook

To get us started with fact collection, here’s a Git repo with the Ansible playbooks I use to gather facts and configs on all of my random network devices:
https://github.com/harrytruman/facts-machine

Before we can use it, we need to clone this repo somewhere for Ansible to run it:

git clone https://github.com/harrytruman/facts-machine

This will create a directory called facts-machine. Within that repo, I have my Ansible config (ansible.cfg) set to look for either an inventory file or directory called “inventory.” Keep it simple.

Copy your inventory into that directory, change into it, and run the fact collection playbook!

cp inventory facts-machine
cd facts-machine
ansible-playbook -i inventory facts.yml

This will run a playbook that will gather device info — and the full running config for every device in your inventory. This role will connect to these devices:

ansible_network_os:
  eos
  ios
  iosxr
  nxos
  aruba
  aireos
  f5-os
  fortimgr
  unos
  paloalto
  vyos

Every Config…from Every Device!

In one fell swoop, you suddenly have a backup of every network config…from every device! Ansible Facts will be available at the end of the playbook run.

ansible_facts:
  ansible_net_api: cliconf
  ansible_net_fqdn: rtr1
  ansible_net_gather_subset:
  - all
  ansible_net_hostname: rtr1
  ansible_net_image: flash:EOS.swi
  ansible_net_model: vEOS
  ansible_net_python_version: 2.7.5
  ansible_net_serialnum: D00E130991A37B49F970714D8CCF7FCB
  ansible_net_system: eos
  ansible_net_version: 4.22.0F
  ansible_network_resources:
    interfaces:
    - enabled: true
      name: Ethernet1
      mtu: 1476
    - enabled: true
      name: Loopback0
  Etc… etc… etc…
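Once you have facts like these for every device, they’re just data you can slice however you like. As a small sketch, here’s Python that groups hosts by their ansible_net_system fact. The facts_by_host dictionary is a hypothetical stand-in for however you store the collected facts, filled in with the two sample devices from this series.

```python
from collections import defaultdict

# Hypothetical store of collected facts, keyed by inventory hostname.
facts_by_host = {
    "rtr1": {"ansible_net_system": "eos", "ansible_net_version": "4.22.0F"},
    "rtr2": {"ansible_net_system": "nxos", "ansible_net_version": "14.22.0F"},
}


def hosts_by_system(facts_by_host):
    """Group hostnames by the network OS reported in their facts."""
    groups = defaultdict(list)
    for host, facts in sorted(facts_by_host.items()):
        groups[facts["ansible_net_system"]].append(host)
    return dict(groups)


print(hosts_by_system(facts_by_host))
# {'eos': ['rtr1'], 'nxos': ['rtr2']}
```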

Part 3: https://www.landoman.com/2020/02/09/automating-networks-with-ansible-part-3/

Automating Networks with Ansible – Part 1

Configuring switches and routers, in theory, is a simple thing. In my case, it was the first “real” thing I did outside of fiddling with the desktop PCs of my childhood. A family friend ran a dial-up ISP out of our basement, and I somehow ended up learning about BGP routes, and troubleshooting T1 connectivity problems in the middle of the night. I was hooked, and I’ve been working with networks and servers ever since.

Fast-forward 20 years to the present day. For a technology that so rarely changes, you would think that network devices would have been the first piece of the IT stack to get automated on a large scale. Things that rarely change are usually at the top of the list. Yet, believe it or not, the ol’ reliable method that I mastered in the late ’90s (copy/pasting from docs and spreadsheets) is still the main source of network automation.

But…why?

Every Network is Different

If you’ve ever attempted network automation before, then you know all too well why it’s so often a mind-numbingly frustrating effort: every vendor, every model, every device type…each has totally different commands, configurations, language/syntax, and firmware/operating systems.

Every network, and every device in those networks, is a hodge-podge combination that’s unique and often vastly different from place to place. In general, networks are logistical nightmares with a seemingly infinite set of random devices generating random output.

Take Cisco IOS, for instance. If you start going back a few years, you’ll eventually end up with older versions that begin having slightly different command syntaxes, standard output, and terminal lengths. And outdated SSH versions further complicate matters, as that requires you to subvert basic connection security.

And that doesn’t even include the challenges of identifying inventory sources and establishing a network source of truth. To have even that simple starting point for an automation project is often enough to fold hardened developers who aren’t already familiar with the ins and outs of network infrastructure.

Either way you cut it, your configuration and implementation options will be slightly different between each device, on each OS, on each firmware version, on each platform, etc… You need a tool that can connect to all of them, and give you standardized configurations and outputs for each of these different network devices.

How Did I Get Here?

A few years ago, I started a project to establish a network automation platform for a huge company. They had 15,000+ routers, switches, firewalls, and load-balancers. This particular network was spread across the globe in a combo of datacenters, support centers, offices, stores, warehouses…it could reliably be anything, anywhere, running any version of who knows what.

Although I started my tech career doing networking, I ended up becoming a Linux engineer after a fateful bait-and-switch with a government job back in 2010. That worked out quite well, actually, as it was a cool gig that introduced me to all sorts of things I’d have never done otherwise. Incidentally, I fell in love with Linux, and ended up finding Ansible a few years later.

Through the years, I ended up being able to blend my career into a combo of Linux, VM/containers, and network architecture. Anything and everything infrastructure. So when I got my first chance to tackle that huge network automation project, I was both terrified and excited. I dreaded the idea of nearly endless variance in gigantic networks, but I couldn’t wait to see what my time as a Linux engineer had taught me about automating network devices.

I knew that the sheer scale and variety of devices was going to be insane. But that’s where Ansible comes in! And let me tell you, it was an amazingly fun and challenging endeavor. I still remember the giddy feeling of running my first command against every device at the same time. And things have only gotten easier since then!

So Why Ansible?

So why Ansible? It’s lightweight and easy to learn. You can have it up and running in less than five minutes. There’s no agent to install or manage. It does its configuration over SSH and HTTPS. Blah blah blah.

I may get in trouble for saying this, but Ansible is as close to a replacement to programming as you can get. It’s automation for everyone. From people who don’t know how to program, to people that do…and to people who don’t want to know how to program!

Imagine, if you will, that you’re me from 20 years ago: new to tech and new to the idea of automation. People like me, who want to start automating their everyday things, will likely want to start with all the stuff they’re copy/pasting from a Word doc into a device terminal. Nobody should need to learn a new programming language just to start automating things.

IT tools for the masses are all but dead in the water if they require in-depth programming knowledge to even begin understanding how they work. Puppet is borderline, with its nightmare learning curve. And don’t even bother thinking about Chef unless you’re already fully competent with Ruby. I say this having been a former user/developer with both, and never again!

The beauty of Ansible is that you can have an entire team of people pick it up and start using it in almost no time, regardless of how new or experienced they may already be. If you want to quickly learn how to automate things, look no further!

Getting Started with Ansible

Ansible doesn’t have a steep learning curve and it doesn’t require any sort of programming background to use. You can begin running commands against your network inventory in no time at all. And I can prove it!

This next section will overview how to start using Ansible. Download and install it, make an inventory, and then run a playbook against your network, all in less than five minutes!

Part Two:
http://www.landoman.com/2020/02/07/getting-started-with-network-automation-part-2/