How Do You Start Your Network Automation Adoption Journey?

A lot of people are lucky to see the inner workings of a single tech giant or Fortune 100 company. Red Hat has the pleasure of working with basically all of them. And I think that’s what I love most about being a consultant and architect at Red Hat — I get to speak to, and work with, so many different people and groups from all of the biggest companies all over the world.

I think it’s endlessly fascinating to hear about what everyone else out there is doing. And nowadays, I talk to all sorts of people about network automaton. This is the definition of a dream job!

After spending years building out massive networking automation projects with tens of thousands, and hundreds of thousands of devices, device management is easier than ever. Ansible has evolved staggeringly quick, and in a lot of ways, the actual device configuration is a problem that’s quickly being solved.

More than ever, the question I most often get is simply how do I get started doing a network automation project? Nowadays, the biggest hurdle is often just getting your head wrapped around the options, and ideas, and ways, about how to do things one way or another.

How do you begin even thinking about going from a small lab with a dozen devices…to some gigantic production network with thousands upon thousands of devices that you’re supposed to just…”manage and automate?”

From my perspective as an infrastructure architect, this is one of the best ways that I’ve found for people to begin managing their network in a practical way:

1. You should start your automation journey by gathering facts from everything on your network.

For example, this is my network fact role that I have been using for years. Anytime I come across a new network OS somewhere, I add it to the list:
https://github.com/harrytruman/facts-machine

Anyway, this role will give you parsed configs for everything Cisco/Arista/JunOS, and the raw config (show run all) for every other device it encounters:

Here’s more about how/why I do fact collection first:
https://www.landoman.com/2020/02/07/automating-networks-with-ansible-part-1/

2. Once you have fact collection running, then you’re ready to begin Ansible state/config management!

Building playbooks has never been quicker and easier. As of 2.9, Ansible’s network resource modules will let us do state management rather than just config mgmtIdentify variables, built templates, and the modules do the rest. This is the easiest way to build and implement backups/restores too.

The resource modules will determine which commands need to be sent and in which order, whether things need to be removed first, etc… The things that once took a year can be done in days or weeks now…

https://www.landoman.com/2020/02/25/managing-network-interface-states/

3. Next, use your network facts to build or enhance your CMDB!

Establishing a CMDB is the prerequisite to doing anything with Ansible long-term. With Tower as the API/UI around Ansible, I prefer pairing it with Elasticsearch [ELK] stacks to create a full-tilt CMDB and search engine combo.

This is all done through Ansible Facts and Tower logging — nothing else required. I gather facts against everything on the network, and I use playbooks to search Elasticsearch for that data from whatever time I’m interested in, so I can compare/diff or retrieve specific configs to be used as backups/restores.

Keep in mind that Tower itself is often not the best place to be doing heavy searching and log/job analysis. In general, we recommend you offload search and analytics to an external service. And at large scale — and certainly at high volume — facts and logging are the gateway to a big data project.

https://github.com/harrytruman/ansible-tower
https://github.com/harrytruman/elk-ansible

4. And now that we have all of these basic functions in place, it’s time to begin scale and performance testing. Part of this involves setting up a development and testing framework. The rest of it is purely and exercise in establishing standards that allow people to efficiently learn how to work with and create content using Ansible and Git.

Everything I’ve covered so far can be stood up and configured with basic functionality rather quickly. And at the very least, these specific tools and technologies will all scale with us as quickly as we can develop things to use them.

https://www.landoman.com/2020/02/10/scaling-ansible-and-awx-tower-with-network-and-cloud-inventories/

This all takes time to build out to full scale in a large network, but it’s a tried and true, and a practical way, to begin your network automation adoption.

This framework — and the fundamental objective of knowing what’s running on your network at any given time — has been implemented with tremendous success in every network infrastructure I’ve worked on.

The day one results are immediate, and the foundation for all of this can be built in the time it takes to do a POC. The fact collection and logging that we’re doing through Tower and ELK both lend themselves well to a quick implementation and gradual scale-up to running against massive inventories.

Managing Network Interface States

When I first started doing network automation during my trip down consulting lane a few years ago, the idea of configuring interfaces was…contentious. Depending on the types of devices in my inventories, and the spread of potential interface sources (or lack thereof), I was genuinely anxious at the thought of interface discussions.

From my experience, you can have a nearly limitless list of configuration commands that you need to add/remove/verify, command/config sources are often whatever is running on a production device, and you almost certainly have to deal with endless command lists that are slightly different between every vendor and device. And obviously hundreds or thousands of interface templates.

And, of course, Regex. Dread it, run from it, regex is still the quickest way to parse text.

If this fresh hell sounds familiar, then you’ll be pleased to know that there’s a better way: state management. No more rigorous output inspection, no more wondering what commands to run and in which order to run them…

Ansible Network Resource Modules are the solution to managing device state across different devices and even different device types.

Resource Modules already have the logic built in to know how config properties need to be orchestrated in which specific ways, and these modules know how to run the behind-the-scenes commands that get you the desired configuration state.

For a deep dive into Resource Modules, my friend Trishna did a wonderful talk at Ansiblefest 2019.

As a practical example, here’s a short snippet of an interface variable template:

interface_config:
- interface: Ethernet1/1
  description: ansible_managed-Te0/1/2
  enabled: True
  mode: trunk
  portchannel_id: 100

- interface: Ethernet1/2
  enabled: False

...

- interface: port-channel100
  description: vPC PeerLink
  mode: trunk
  enabled: True
  vpc_peerlink: True
  members:
    - member: Ethernet1/1
      mode: active
    - member: Ethernet1/36
      mode: active

Using the new network resource modules, we simply define our interface properties, and Ansible will figure out the rest:

- name: Configure Interface Settings
nxos_interfaces:

config:
name: "{{ item['interface'] }}"
description: "{{ item['description'] }}"
enabled: "{{ item['enabled'] }}"
mode: "{% if 'ip_address' in item %}layer3{% else %}layer2{% endif %}"
state: replaced
loop: "{{ interface_config }}"
when: (interface_config is defined and (item['enabled'] == True))

In the example above, the new interface modules will look at an interface config template and determine if it needs to be enabled. If so, it will loop through each interface and begin setting those config values. You’ll do the same sort of thing for your VLANs/Trunks, VPCs, Port Channels, etc…

- name: Configure Port Channels
nxos_lag_interfaces:
config:
- name: "{{ item['interface'] }}"
members: "{{ item['members'] }}"
state: replaced
loop: "{{ interface_config }}"
when: ('port-channel' in item['interface'] and ('members' in item))

And if the nxos_interfaces configs looks familiar, that’s because they are! It’s the same thing as what you would get from nxos_facts parsing the interfaces section:

- name: gather nxos facts
nxos_facts:
gather_subset: interfaces

If you do it right, you can now take interface facts and pass them right back into Ansible as configuration properties!

ansible_facts:
  ansible_net_fqdn: rtr2
  ansible_net_gather_subset:
  - interfaces
  ansible_net_hostname: rtr2
  ansible_net_serialnum: D01E1309…
  ansible_net_system: nxos
  ansible_net_model: 93180yc-ex
  ansible_net_version: 14.22.0F
  ansible_network_resources:
    interfaces:
    - name: Ethernet1/1
      enabled: true
      mode: trunk
    - name: Ethernet1/2
      enabled: false 

Looks awfully familiar to what we started with up top, eh? Config to code, and vice versa!

Debug and Benchmark Ansible Playbook Performance

Debugging and benchmarking can be heavy topics. Frankly, there are a lot of options when it comes to analyzing playbook performance. My goal will be to make this as simple as possible.

Playbook Debugging and Connection Logging

Ansible allows you to enable debug logging at a number of levels. Most importantly, at both the playbook run level, and the network connection level. Having that info will become invaluable with your future troubleshooting efforts.

From the Ansible CLI, you can set these as environment variables before running a playbook:

export ANSIBLE_DEBUG=true
export ANSIBLE_LOG_PATH=/tmp/ansible-debug.log
ansible-playbook … -v

-v will show increased logging around individual plays/tasks
-vvv will show Ansible playbook execution run logs
-vvvv will show SSH and TCP connection logging

And from within AWX/Tower, you can set the logging/debugging verbosity for each playbook job template:
tower job template verbosity

If standard debug options aren’t enough…if you want to truly see everything that’s happening with Ansible, say no more! You can keep remote log files and enable persistent log messages. And with great power comes great responsibility.

export ANSIBLE_KEEP_REMOTE_FILES=true
export ANSIBLE_PERSISTENT_LOG_MESSAGES=True

Both of these are super insecure. They will give you everything that Ansible does and sees — the deepest view of exactly what’s happening on a remote device. And you’ll expose vars/logs as they’re unencrypted and read. So do be careful!

On the network and non-OS side, I often enable ansible_persistent_log_messages to see netconf responses, system calls, and other such things from my network inventories.

2021-03-01 20:35:23,127 p=26577 u=ec2-user n=ansible | jsonrpc response: {"jsonrpc": "2.0", "id": "9c4d684c-d252-4b6d-b624-dff71b20e0d3", "result": [["vvvv", "loaded netconf plugin default from path /home/ec2-user/ansible/ansible_venv/lib/python3.7/site-packages/ansible/plugins/netconf/default.py for network_os default"], ["log", "network_os is set to default"], ["warning", "Persistent connection logging is enabled for lab1-idf1-acc-sw01. This will log ALL interactions to /home/ec2-user/ansible/ansible_debug.log and WILL NOT redact sensitive configuration like passwords. USE WITH CAUTION!"]]}

And then it comes to a traditional OS, setting ansible_keep_remote_files will allow you to see the equivalent levels of process events, system calls, and whatnot that have happened on the remote system.

Ansible and Network Debugging:
https://docs.ansible.com/ansible/latest/dev_guide/debugging.html
https://docs.ansible.com/ansible/latest/network/user_guide/network_debug_troubleshooting.html

Scale and Performance Testing

Now let’s talk benchmarking! We can easily get ourselves deep into the weeds here. So let’s start simple.

Simply noting Ansible CLI or Tower Job run times is the best place to start. Additionally, we can look at timing for individual tasks. To aid us in the process, there are a number of plugins that can be enabled both in Ansible.

Ansible is built around many types of plugins, one of which is callback plugins. Callback plugins allow you to add some very interesting capabilities to Ansible, such as making your computer read the playbook as it runs. Ansible ships with a number of callback plugins that are ready to use out-of-the-box — you simply have to enable them in your ansible.cfg. Add a comma separated list of callback plugins to callback_whitelist in your ansible.cfg.

The particular callback plugin that will help with performance tuning our playbooks is called profile_tasks. It prints out a detailed breakdown of task execution times, sorted from longest to shortest, as well as a running timer during play execution. Speaking of, timer is another useful callback plugin that prints total execution time, similar to time but with more friendly output.

Ultimately, let’s start with these. Edit your ansible.cfg file to enable these callback plugins:

callback_whitelist = profile_tasks, timer

With the profile_tasks and timer callback plugins enabled, run your playbook again and you’ll see more output. For example, here’s a profile of fact collection tasks on a single Cisco inventory host:

ansible-playbook facts.yml --ask-vault-pass -e "survey_hosts=cisco-ios"

ansible_facts : collect output from ios device ------------ 1.94s
ansible_facts : include cisco-ios tasks ------------------- 0.50s
ansible_facts : set config_lines fact --------------------- 0.26s
ansible_facts : set version fact -------------------------- 0.07s
ansible_facts : set management interface name fact -------- 0.07s
ansible_facts : set model number -------------------------- 0.07s
ansible_facts : set config fact --------------------------- 0.07s

And a profile of `change password` on a host:

ansible-playbook config_localpw.yml -e "survey_hosts=cisco-ios"

config_localpw : Update line passwords --------------------- 4.66s
ansible_facts : collect output from ios device ------------- 5.06s
ansible_facts : include cisco-ios tasks -------------------- 0.51s
config_localpw : Update line passwords --------------------- 0.34s
config_localpw : Update enable and username config lines --- 0.33s
config_localpw : debug ------------------------------------- 0.23s
config_localpw : Update enable and username config lines --- 0.22s
config_localpw : Update terminal server username doorbell -- 0.22s
config_localpw : Update line passwords --------------------- 0.20s
config_localpw : Update terminal server username doorbell -- 0.20s
config_localpw : Update terminal server username doorbell -- 0.19s
config_localpw : debug ------------------------------------- 0.19s
config_localpw : Identify if it has a modem ---------------- 0.14s
config_localpw : set_fact - Modem slot 2 ------------------- 0.11s
config_localpw : set_fact - Modem slot 1 ------------------- 0.10s
config_localpw : Update terminal server username doorbell -- 0.10s
config_localpw : set_fact - Modem slot 3 ------------------- 0.10s
config_localpw : Update line passwords --------------------- 0.10s
config_localpw : Update enable and username config lines --- 0.09s
config_localpw : Update enable and username config lines --- 0.09s

Automation will be unique to every organization, and it’s important to regularly track performance benchmarks as your roles evolves. Beyond the obvious benefit of being able to accurately estimate your automation run times, you can determine where improvements can be made while proactively monitoring for faulty code/logic that will inevitably slip through peer reviews.

Profiling tasks/roles:
https://docs.ansible.com/ansible/latest/plugins/callback/profile_tasks.html
https://docs.ansible.com/ansible/latest/plugins/callback/profile_roles.html

All plugins:
https://docs.ansible.com/ansible/latest/plugins/callback.html

Ansible CLI Process Monitoring

I use dstat when I’m trying to benchmark Ansible CLI performance, and it’s the first place I start when I need to troubleshoot performance.

dstat is an is an extremely versatile replacement for vmstat, iostat and ifstat — it’s an all-in-one stat collection tool. It will gather CPU, process, memory, I/O, network, etc… I use it to log/monitor any and all system stats related to Ansible/Tower.

dstat -tcmslrnpyi --fs --socket --unix --top-oom
   -t  timestamp
   -c total CPU usage
   -m memory usage
   -s swap space
   -l load average
   -r disk I/O
   -n network I/O
   -p processes
   -y linux system stats
   -i interrupts
   --fs filesystem open files and inodes
   --socket network sockets
   --unix unix sockets--top-oom watch for OOM process

Note: In RHEL8+, dstat is supplied by the pcp-system-tools package.

Scaling Ansible and AWX/Tower with Network and Cloud Inventories

This topic is covered more in-depth in my Red Hat Summit talk on Managing 15,000 Network Devices.

Quick primer: Ansible is a CLI orchestration application that is written in Python and that operates over SSH and HTTPS. AWX (downstream, unsupported) and Tower (upstream, supported) are the suite of UI/API, job scheduler, and security gateway functionalities around Ansible.

Ansible and AWX/Tower operate and function somewhat differently when configuring network, cloud, and generic platform endpoints, versus when performing traditional OS management or targeting APIs. The differentiator between Ansible’s connectivity is, quite frankly, OS and applications — things that can run Python — versus everything else that cannot run Python.

Ansible with Operating Systems

When Ansible runs against an OS like Linux and Windows, the remote hosts receive a tarball of python programs/plugins, Operating System, or API commands via SSH or HTTPS. The remote hosts unpack and runs these playbooks, while APIs receive a sequence of URLs. In either case, both types of OS and API configurations returns the results to Ansible/Tower. In the case of OS’ like Linux and Windows, these hosts process their own data and state changes, and then return the results to Ansible/Tower.

As an example with a Linux host, a standard playbook to enable and configure the host logging service would be initiated by Ansible/Tower, and would then run entirely on the remote host. Upon completion, only task results and state changes are sent back to Ansible. With OS automation, Tower orchestrates changes and processes data.

Ansible with Network and Cloud Devices

Network and cloud devices, on the other hand,  don’t perform their own data processing, and are often sending nonstop command output back to Ansible. In this case, all data processing is performed locally on Ansible or AWX/Tower nodes.

Rather than being able to rely on remote devices to do their own work, Ansible handles all data processing as it’s received from network cloud devices. This will have drastic, and potentially catastrophic, implications when running playbooks at scale against network/cloud inventories.

Ansible Networking at Scale — Things to Consider

In the pursuit of scaling Ansible and AWX/Tower to manage network and cloud devices, we must consider a number of factors that will directly impact playbook and job performance:

Frequency/extent of orchestrating/scheduling device changes
With any large inventory, there comes a balancing act between scheduling frequent or large-scale configuration changes, while avoiding physical resource contention. At a high level, this can be as simple as benchmarking job run times with Tower resource loads, and setting job template forks accordingly. This will become critical in future development. More on that later.

Device configuration size
Most network automation roles will be utilizing Ansible Facts derived from inventory vars and device configs. By looking at the raw device config sizes, such as the text output from show run all, we can establish a rough estimate of per-host memory usage during large jobs.

Inventory sizes and devices families, e.g. IOS, NXOS, XR
Depending on overall inventory size, and the likelihood of significant inventory metadata, it’s critical to ensure that inventories are broken into multiple smaller groups — group sizes of 500 or less are preferable, while it’s highly recommended to limit max group sizes to 5,000 or less.

It’s important to note that device types/families perform noticeably faster/slower than others. IOS, for instance, is often 3-4 faster than NXOS.

Making Use of Ansible Facts
Ansible can collect device “facts” — useful variables about remote hosts — that can be used in playbooks. These facts can be cached in Tower, as well. The combination of using network facts and fact caching can allow you to poll existing data rather than parsing real-time commands.

Effectively using facts, and the fact cache, will significantly increase Ansible/Tower job speed, while reducing overall processing loads.

Development methodology
When creating new automation roles, it’s imperative that you establish solid standards and development practices. Ideally, you want to outright avoid potentially significant processing and execution times that plague novice developers.

Start with simple, stepping through your automation workflow task-by-task, and understand the logical progression of tasks/changes.. Ansible is a wonderfully simple tool, but it’s easy to overcomplicate code with faulty or overly-complex logic.

And be careful with numerous role dependencies, and dependency recursion using layer-upon-layer of ’include’ and ’import’. If you’re traversing more than 3-4 levels per role, then it’s time to break out that automation/logic into smaller chunks. Otherwise, a large role running against a large inventory can run OOM simply from attempting to load the same million dependencies per host.

Easier said than done, of course. There’s a lot here, and to some extent, this all comes with time. Write, play, break, and learn!

Automating Networks with Ansible – Part 3

In part 1, we covered why to use Ansible. In part 2, we covered how to start using Ansible. So far, we’ve installed Ansible, setup a network inventory, and ran a playbook that gathers info and “facts” from our network inventory.

But what practical things can we actually do with all of this info we have now?

Good news! We can use our fact collection role as the foundation for everything we do and build next. Everything that Ansible can be configured or orchestrated to do will always involve variables, and we’ve just given ourselves a literal dictionary worth of automation logic to use!

Ansible Fact Gathering

Let’s take a step back for a moment and talk about what this fact gathering thing is all about.

The fact role I use does two things. First, I use Ansible’s native configuration parsers, Network Resource Modules (more on that later), to parse the raw device config. Second, I use custom facts that I set from running ad-hoc commands.

In a mixed version/device environment where fact modules can’t run against all devices, or if you just need to expand your playbook functionality, you can parse the running config to set custom facts.

As an example, the ios_command module will send commands, register the CLI output, find a specific string, and set it to a custom fact. Command modules are used to send arbitrary commands and return info (e.g., show run, description) — they cannot make changes the running config.

---
- name: collect output from ios device
  ios_command:
  commands: 
    show version
    show interfaces
    show running-config
    show ip interface brief | include {{ ansible_host }}
  register: output

- name: set version fact
  set_fact:
    cacheable: true
    version: "{{ output.stdout[0] | regex_search('Version (\S+)', '\1') | first }}"

- name: set hostname fact
  set_fact:
    cacheable: true
    hostname: "{{ output.stdout[2] | regex_search('\nhostname (.+)', '\1') | first }}"

- name: set management interface name fact
  set_fact:
    cacheable: true
    mgmt_interface_name: "{{ output.stdout[3].split()[0] }}"

- name: set config_lines fact
  set_fact:
    config_lines: "{{ output.stdout_lines[1] }}"

This playbook will run four commands against an IOS host:

1. show version
2. show interfaces
3. show running-config
4. show ip interface brief | include {{ hostname }}

Ansible will then search, parse, split, or otherwise strip out the interesting information, to give you the following facts:

1. version: "14.22.0F"
2. hostname: "hostname"
3. mgmt_interface_name: "int 1/1"
4. config_lines: "full running config ..."

Using Facts as Logic and Conditionals

Let’s take a look at a real world example. Jinja templates are the bread and butter of Ansible configuration, and we can use device variables to determine how which devices get which configs. Everything we picked up during our fact collection run is fair game.

For example, our fact collection playbook gathered this fact:

ansible_net_version: 14.22.0F

We can use that fact in a playbook, to determine whether to place a specific configuration based on the firmware version.

Here’s an example of a Cisco AAA config template that uses the OS/firmware version as the primary way to determine which commands to send:

{% if ansible_net_version.split('.')[0]|int < 15 %}
  aaa authentication login default group tacacs+ line enable
  aaa authentication login securid group tacacs+ line enable
  aaa authentication enable default group tacacs+ enable
  ...
{% endif %}

{% if ansible_net_version.split('.')[0]|int >= 15 %}
  {% if site == "pacific" %}
    aaa authentication login default group pst line enable
    ...
   {% elif site == "mountain" %}
    aaa authentication login default group mst line enable
    ...
   {% endif %}
 {% endif %}

In the playbook above, our first if statement is splitting the firmware version (ansible_net_version) variable into groups at decimals, registering the first group of numbers ([0]) as integers, and determining if that number is less than 15. Version 14 will match the first config stanza, and it will apply that group of configuration lines to that device.

However, if our firmware version matches 15 and above, then Ansible will apply the second config stanza instead. In this case, this scenario tackles the different configuration and command syntaxes that differ between newer and older devices.

Depending on the complexity of your particular network, logic and conditional checks like this will be come invaluable. And if this all makes sense up to this point, then congratulations, you’re well on your way to automating your network!

Automating Networks with Ansible – Part 2

Getting Started with Ansible

Ansible doesn’t have a steep learning curve and it doesn’t require any sort of programming background to use. You can begin running commands against your network inventory in no time at all. And I can prove it!

This is all using network devices as examples, but it’s all general Ansible stuff that we’ll be doing. This next section will overview how to start using Ansible. Download and install it, make an inventory, and then run a playbook against your network — in less than five minutes!

Step One: Installing Ansible and Git

Along with Ansible. we’ll be using Git. Git is a version control system. We will use it as a code repository for storing and controlling access to our network automation playbooks.

Fedora
  dnf install ansible git

CentOS/RHEL
  yum install ansible git

Mac/PIP
  pip install ansible

Ubuntu
  apt update
  apt install software-properties-common
  apt-add-repository --yes --update ppa:ansible/ansible
  apt install ansible
  apt install git

After installation, verify that you can run Ansible:
ansible --version

Full download/install instructions can be found here:
https://docs.ansible.com/ansible/latest/installation_guide/intro_installation.html

Step Two: Create an Inventory

Now that we have Ansible installed, let’s create our inventory that Ansible will use to connect to our hosts. To keep it simple, let’s just start with a small INI file, and a few test devices with the OS they’re running and the user/pass we’ll need to login.

In the host file you create, you’ll have one inventory host per line that defines these variables needed for Ansible to run.

1. ansible_hostname = hostname_fqdn
2. ansible_network_os = ios/nxos
3. ansible_username = username
4. ansible_password = password

Name this file inventory.

[all]
hostname_fqdn  ansible_network_os=ios  ansible_username=<username>  ansible_password=<password>
hostname_fqdn  ansible_network_os=nxos  ansible_username=<username>  ansible_password=<password>

We’ll make a better inventory later. For now, this is as simple as it gets, and this will allow us to immediately begin connecting to and managing our network devices. With Ansible installed, and with our inventory setup with the username, password, and host OS, we’re ready to run something!

The full list of network OS’ can be found here: https://github.com/ansible/ansible/blob/devel/docs/docsite/rst/network/user_guide/platform_index.rst

Verify: Ansible Installed; Inventory Created; Repo Ready

At this point you, you should be able to run Ansible, and you should have an inventory file. Verify that you have both:

ansible --version
file inventory

Now, we need something to run! Since our goal is to begin managing our network devices, then the perfect place to start is at Fact Collection.

In Ansible, facts are useful variables about remote hosts that can be used in playbooks. And variables are how you deal with differences between systems. Facts are information derived from speaking with remote devices/systems.

An example of this might be the IP address of the remote device, or perhaps an interface status or the device model number. Regardless, this means that we can run any command, save that output as a fact, and do something with it…

For instance, we can run a command like show version, and use the output to identify the firmware version. Beyond that, the possibilities are limitless! We can use any device information we can get our hands on.

Step Three: Run a Playbook

To get us started with fact collection, here’s a Git repo with my Ansible playbooks I use to gather facts and configs on all of my random network devices:
https://github.com/harrytruman/facts-machine

Before we can use it, we need to clone this repo somewhere for Ansible to run it:

git clone https://github.com/harrytruman/facts-machine

This will create a directory called facts-machine. Within that repo, I have my Ansible config (ansible.cfg) set to look for either an inventory file or directory called “inventory.” Keep it simple.

Move your inventory into this that directory, and run the fact collection playbook!

cp inventory facts-machine
ansible-playbook -i inventory facts.yml

This will run a playbook that will gather device info — and the full running config for every device in your inventory. This role will connect to these devices:

ansible_network_os:
  eos
  ios
  iosxr
  nxos
  aruba
  aireos
  f5-os
  fortimgr
  unos
  paloalto
  vyos

Every Config…from Every Device!

In one felt swoop, you suddenly have a backup of every network config…from every device! Ansible Facts will be available at the end of the playbook run.

ansible_facts:
  ansible_net_api: cliconf
  ansible_net_fqdn: rtr1
  ansible_net_gather_subset:
  - all
  ansible_net_hostname: rtr1
  ansible_net_image: flash:EOS.swi
  ansible_net_model: vEOS
  ansible_net_python_version: 2.7.5
  ansible_net_serialnum: D00E130991A37B49F970714D8CCF7FCB
  ansible_net_system: eos
  ansible_net_version: 4.22.0F
  ansible_network_resources:
    interfaces:
    - enabled: true
      name: Ethernet1
      mtu: 1476
    - enabled: true
      name: Loopback0
  Etc… etc… etc…

Part 3: https://www.landoman.com/2020/02/09/automating-networks-with-ansible-part-3/

Automating Networks with Ansible – Part 1

Configuring switches and routers, in theory, is a simple thing. In my case, it was the first “real” thing I did outside of fiddling with the desktop PCs of my childhood. A family friend ran a dial-up ISP out of our basement, and I somehow ended up learning about BGP routes, and troubleshooting T1 connectivity problems in the middle of the night. I was hooked, and I’ve been working with networks and servers ever since.

Fast-forward 20-years later, to present day. For a technology that so rarely changes, you would think that network devices would have been the first piece of the IT stack to get automated on a large scale. Things that rarely change are usually at the top of list. Yet, believe it or not, the ol’ reliable method that I mastered in the late-90s — copy/pasting from docs and spreadsheets — is still the main source of network automation.

But…why?

Every Network is Different

If you’ve ever attempted network automation before, then you know all too well why it’s so often a mind-numbingly frustrating effort: every vendor, every model, every device type…each has totally different commands, configurations, language/syntax, and firmware/operating systems.

Every network, and every device in those networks, is a hodge-podge combination that’s unique and often vastly different from place to place. In general, networks are logistical nightmares with a seemingly infinite set of random devices generating random output.

Take Cisco IOS, for instance. If you start going back a few years, you’ll eventually end up with older versions that begin having slightly different command syntaxes, standard output, and terminal lengths. And outdated SSH versions further complicate matters, as that requires you to subvert basic connection security.

And that doesn’t even include the challenges of identifying inventory sources and establishing a network source of truth. To have even that simple starting point for an automation project is often enough to fold hardened developers who aren’t already familiar with the ins and outs of network infrastructure.

Either way you cut it, your configuration and implementation options will be slightly different between each device, on each OS, on each firmware version, on each platform, etc… You need a tool that can connect to all of them, and give you standardized configurations and outputs for each of these different network devices.

How Did I Get Here?

A few years ago, I started a project to establish a network automation platform for a huge company. They had 15,000+ routers, switches, firewalls, and load-balancers. This particular network was spread across the globe in a combo of datacenters, support centers, offices, stores, warehouses…it could reliably be anything, anywhere, running any version of who knows what.

Although I started my tech career doing networking, I ended up becoming a Linux engineer after a fateful bait-and-switch with a government job back in 2010. That worked out quite well, actually, as it was a cool gig that introduced me to all sorts of things I’d have never done otherwise. Incidentally, I fell in love with Linux, and ended up finding Ansible a few years later.

Through the years, I ended up being able to blend my career into a combo if Linux, VM/containers, and network architecture. Anything and everything infrastructure. So when I got my first chance to tackle that huge network automation project, I was terrified and excited both. I dreaded the idea of nearly endless variance in gigantic networks, but I couldn’t wait to see what my time as a Linux engineer had taught me about automating network devices.

I knew that the sheer scale and variety of devices was going to be insane. But that’s where Ansible comes in! And let me tell you, it was an amazingly fun and challenging endeavor. I still remember the giddy feeling of running my first command against every device at the same time. And things have only gotten easier since then!

So Why Ansible?

So why Ansible? It’s lightweight and easy to learn. You can have it up and running in less than five minutes. There’s no agent to install or manage. It does its configuration over SSH and HTTPS. Blah blah blah.

I may get in trouble for saying this, but Ansible is as close to a replacement to programming as you can get. It’s automation for everyone. From people who don’t know how to program, to people that do…and to people who don’t want to know how to program!

Imagine, if you will, that you’re me from 20 years ago — new to tech and new to the idea of automation. For people like me, who want to start automating their everyday things, I’ll likely want to start with all the stuff I’m copy/pasting from a Word doc into a device terminal. Nobody should need to learn a new programming language just to start automating things.

IT tools for the masses are all but dead in the water if they require in-depth programming knowledge to even begin understanding how they work. Puppet is borderline, with its nightmare learning curve. And don’t even bother thinking about Chef unless you already fully competent with Ruby. I say this having been a former user/developer with both — never again!

The beauty of Ansible is that you can have an entire team of people pick it up and start using it in almost no time, regardless of how new or experienced they may already be. If you want to quickly learn how to automate things, look no further!

Getting Started with Ansible

Ansible doesn’t have a steep learning curve and it doesn’t require any sort of programming background to use. You can begin running commands against your network inventory in no time at all. And I can prove it!

This next section will overview how to start using Ansible. Download and install it, make an inventory, and then run a playbook against your network — in less than five minute!

Part Two:
http://www.landoman.com/2020/02/07/getting-started-with-network-automation-part-2/

The Coyote Conference Call

I’m a consultant by day, and a goat farmer by night. The two lifestyles offer a wonderful mix of totally unrelated and fulfilling things for me to do. From daily interactions with bleeding-edge tech, to a backyard with goats, chickens, and turkeys. These two parts of my life rarely intersect. Until they do!

To give you a quick background, I’m often working a day or two in Seattle, and remotely from home the rest of the week. When I’m working from home, I usually spend a few hours a day talking with a nearly unforeseen mix of colleagues, clients, and customers. On this particular day, I had a call with a hard drive and flash storage manufacturer.

So I’m on this call. I love these sorts of things, too, and I get pretty absorbed in conversations. Things start to wrap up, and then they go back into discussion. I’m chatting along for 45 minutes, walking around my living room, just blindly mulling about. I turn and glance outside to see a coyote staring me down from straight through the back window. Real big one too.

I stopped mid-sentence into whatever I was talking about and said, “Hold on guys, I’ve got to step away for a minute. There’s a coyote in my backyard.” There’s a brief moment of awkward silence before a few people start laughing. I remember one person snorting and saying, “Hah, he doesn’t live in the city does he?” Another person chimes in and says, “We’re about to hear a gunshot aren’t we?” Smart…I muted the line as I rounded the garage with my fancy pellet gun.

By that time, the coyote was going into our neighbors’ field. I bounded to the length of driveway and noticed that it had stopped in the middle of their field. It just watched me. So I walked up to the driveway hedge, found a low spot, and shot it with my fancy pellet gun. I hit it and it ran away feeling the sting of my mighty pea-shooter.

Things were wrapping up on the call by then. I unmuted and joined in with the goodbyes, and I messaged my colleagues to let them know that I’d be sending over my notes soon. Then I had a mild panic attack. That thing was staring me down through my living room window as I gave pointers on automating Linux servers and network devices.

But at least I didn’t drop the call!

Seattle

I’ve spent month’s trying to write about Seattle. How to describe the city, what things make it interesting, the features and attractions that draw travelers and enchant residents. There are so many things to write about that it became an exercise in redundancy. There are a hundred tourist guides to list the best restaurants, the best museums, the best space needle, the best legal cannabis stores that have generated over a billion dollars in the past three years. Really. It’s mind-boggling.

Amidst the dozens of local tours, you can learn all about Seattle’s history starting in the mid-1800s. An immense logging industry at one time supplied almost half the world with some of the finest lumber. The real fun started in the midst of the gold rush — the city exploded to support miners before they set off through the thousand miles of nothing between Seattle and the maddening dream of Alaskan gold. The last city before the endless expanse of the north. The city also burned down three times, but half a century later it was well on its way to becoming a center of technology. The local government began promoting the tech industry in the ‘60s, and they set the foundation for Seattle’s eventual rise to the tech giant that it is today.

You can find out all about the region’s incredible array of outdoor-y stuff. The rugged Pacific coast, and rain forests of the Olympic Mountains. Two giant volcanoes — one that gets more snow than any other mountain in North America, another that’s utterly majestic and ready to explode and kill us all. There’s world-class skiing and adventuring in the Cascades an hour to the east, and beyond that some of the best wine vineyards in the world. And 75% of the country’s beer hops are grown there too. And Walla Walla is in a legit desert. The Puget Sound is the most biodiverse sea on the planet, and the resident Orca whales are super cool.

Seattle is great. I just proved it with legal weed and all that other cool shit. So after reading all of that, now you know all about it too!

Until recently, you could sum up my thoughts about Seattle like I was reading from a Lonely Planet guide — all the things that make Seattle a nice city. The bragging rights, so-to-speak. Only after I left did I start thinking about everything else that truly makes the city great. The stuff you take for granted, the good and the bad that make Seattle feel like an incredible place to live. The emotions you’ll experience that will either drive you away or draw you in forever.

As you get to know Seattle, you learn about the problems faced by a city growing far faster than it can accommodate. Traffic sucks, public transit is slow, housing prices are skyrocketing, and government can’t keep up with the constant effort required to quickly make changes. Seattle spends more on social programs than any other city in the country, and we have the largest homeless population in the country. Fortunately Washington has one of the highest voting rates in the country, and the average citizen is allowed to raise support for putting measures on the ballot. Thanks in large part to that, Seattle has one of the highest standards of living in the world, and people fight to keep it that way.

Everywhere you go, you’re reminded that you’re somewhere special, near the end of the world. The vast expanse of the Pacific Ocean is a whole lot of cold nothing for a few thousand miles. You can hike the Cascades up and down for years, and you can die real quick on Rainier if you’re unlucky — do it right and your body will never be recovered! Two hours to the north, Vancouver is a sight to behold in its own right, and Canada is just great in general. And then it’s another thousand miles of more nothing to Alaska…where there’s even more nothing.

The point being that living in Seattle, we’re here for a reason and we’re all in it together. For the most part, we all really love it here. Like so many others, we were drawn here by the need to get away, to do something new in this strange, enchanting city at the end of civilization. We all knew about the rain, but few can ever truly embrace the hundreds of dark, wet, overcast days. Even the most resilient will often sink into despair after living in darkness for 100 days at a time. But at some point we do so willingly. At least the dark, damp environment grows the softest lawn mosses and makes for a city that’s lush and green year-round.

But then there’s summer in Seattle. It’s like nothing else, really. It’s short, generally from the 4th of July to late-September. Certainly no later than that. But it’s what makes winter despair worth every damn minute. I experienced it for the first time when we moved here in 2012. That summer broke a record for the longest stretch without rain, and it was a sublime welcome to the Pacific Northwest. The happiness that people feel during a Seattle summer is quite tangible. I once again felt that same excitement yesterday as I flew into Seattle — the last day of winter, with a perfect sunset in a nearly cloudless sky. The air is still cool, but the days are quickly growing longer while the sun grows warmer. Suddenly, people are happier. It’s a nearly instant transformation that sets over the entire city in that very moment. The first beautiful day that heralds the changing seasons.

Suddenly the traffic becomes slightly more bearable when you can get home before sunset. You start to pretend that we’ll not need your rain jackets, and the thought of fully abandoning them will occasionally ruin a hike or kill you on Rainier if you’re unlucky. You’ll make excuses to start going on walks in the evenings. You’re able to sit outside for dinner with increasing frequency, if not while being cold because you didn’t think you’d still need a jacket. It’s hope. It just gets better every day.

When summer finally arrives the morning after July 4th, so begins one of the greatest human experiences — usually a solid three months of cloudless skies, temperatures in the mid-80s, and the most blissfully warm, comfortable heat. You can be outside all day and not be hot. And make no mistake, you’ll be outside a lot. You and everyone else in the whole damned Pacific Northwest. The world is truly alive, from San Francisco to Anchorage. One giant swath of happiness, reveling in summer. There’s no better reward for battling half the year in darkness. It feels like it lasts forever, and you love every minute of summer. Every damn minute. In fact, you’re so happy that you welcome the first fall rains that begin turning the air crisp and cool. The next couple months are the most perfect, and it’s only once the heavy, cold November storms set in that you begin the cycle anew.

Do that a few times and you’ll begin to really appreciate everything that goes on out here. A deep respect develops for the sometimes oppressive northwest climate that dictates the livelihood of everyone living here. You begin to understand how the world around you directly impacts your happiness and well-being. Everything in the Pacific Northwest is connected, and the rich biodiversity that developed has been revered and cultivated. Seattle has so much to offer; people come here for a reason. They want something new. They live for the future. They embrace the local economy, they encourage sustainability, and they promote high social standards.

I of course had those experiences, felt those emotions, and shared those ideals, but I never started putting it all together until I left. Indeed, one last Mighty-O donut and we were off, and the immediate understanding of what it all meant began forming in my mind. Exactly what I was leaving behind — everything that made Seattle incredible. A city so far out of the way. There’s nothing like it in the world.

Welcome Back?

It’s been a while since I’ve had this site up and running…so welcome back, I suppose!