Defining Code and Configuration in Cloud Environments
- June 15, 2017
Where does code end and configuration begin? It’s a perennial question. As the lines between the two continue to blur, we here at Flux7 have begun defining new terminology and ways of thinking about the issue, which we’d like to propose in today’s blog post. As part of a community of developers whose work often immerses them in these issues, we greatly appreciate your feedback and help in fine-tuning these definitions for the betterment of our collective work.
These definitions were once fairly simple: if you could touch and feel it, it was hardware, and software was what ran on that hardware. This definition persisted for quite some time, until virtual machines (VMs) emerged on the scene and really began to blur the lines. VMs aren’t really hardware or software per se, but rather something in between. Suddenly hardware became something that could be virtually provisioned and no longer had to be shipped in a box to your physical address.
Cloud providers came in and blurred the line even further as they provisioned not just VMs but a variety of hardware — from routers to racks. Yet, these former data center components aren’t software either. We refer to them as Infrastructure-as-a-Service (or IaaS) and though you can no longer touch and feel them, they are definitely still part of the overall technology stack.
Moreover, the lines continue to blur as cloud providers increasingly provision OSs for their clients as well. In our experience working with AWS cloud users, most customers don’t change the OS once it is provisioned; rather, they use the version that AWS provides, opting for an OS image with a specific purpose, e.g., hardened CIS-benchmarked images for PCI workloads.
A New Definition
Clearly, new definitions and boundaries are needed. Given our extensive hands-on work with enterprise cloud infrastructure, we are proposing the following layer boundaries, with infrastructure defined as everything up to and including the virtual machine.
Layer 1/Infrastructure:
Infrastructure that is provided by the cloud provider, including the virtual machine, VPCs, subnets, routers, EBS volumes, etc.
Layer 2/OS:
Consists of the operating system and anything that comes pre-baked in the images by AWS or another cloud provider.
Layer 3/Landing Zone Configuration:
Software needed to operationalize applications in the cloud, e.g., logging agents, anti-virus software, vulnerability detection tools, or in-house monitoring tools that run on all VMs as part of the corporate standard build. As you can see, this layer is made up of broader, application-agnostic yet organization-specific solutions. Most customers add this layer on top of Layer 2 to create a “Golden AMI” for their organization, from which all VMs are expected to start.
Layer 4/App Configuration:
This layer contains software developed by a third party that must exist on the machine for the application to work. Also referred to as middleware, it includes third-party software required by a given application, e.g., Python or the Java Runtime.
Layer 5/Build Artifacts:
Application-specific code, e.g., the Python code itself or a JAR or WAR file for Java-based apps.
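To make the ordering of the layers concrete, here is a minimal Python sketch of the five layers as an ordered enumeration. The names `Layer`, `GOLDEN_AMI_LAYERS`, `in_golden_ami`, and `layers_above` are our own illustrative choices, not part of any AWS tooling, and the Golden AMI grouping simply follows the Layer 3 description above.

```python
from enum import IntEnum


class Layer(IntEnum):
    """The five proposed layers, numbered from the bottom of the stack up."""

    INFRASTRUCTURE = 1              # VMs, VPCs, subnets, routers, EBS volumes
    OS = 2                          # operating system as pre-baked by the provider
    LANDING_ZONE_CONFIGURATION = 3  # organization-wide agents and tools on every VM
    APP_CONFIGURATION = 4           # third-party software a given application requires
    BUILD_ARTIFACTS = 5             # application-specific code


# Layers combined into an organization's "Golden AMI" (Layer 3 added on Layer 2).
GOLDEN_AMI_LAYERS = frozenset({Layer.OS, Layer.LANDING_ZONE_CONFIGURATION})


def in_golden_ami(layer: Layer) -> bool:
    """Return True if the layer is part of the Golden AMI."""
    return layer in GOLDEN_AMI_LAYERS


def layers_above(layer: Layer) -> list[Layer]:
    """Return every layer that sits on top of the given one, lowest first."""
    return [other for other in Layer if other > layer]
```

For example, `layers_above(Layer.LANDING_ZONE_CONFIGURATION)` returns the App Configuration and Build Artifacts layers, which is what still has to be delivered onto a VM that starts from the Golden AMI.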
Putting It Into Practice
Let’s look at a few example questions to apply these boundaries and definitions:
What is configuration and what is code for a hosted OS and CRM? This question arose in our work with a services organization running SugarCRM, which has two required components: PHP and the SugarCRM engine. In this case, both elements are considered Layer 4, as they are application-specific requirements. However, the firm also needed to customize the CRM engine with plug-ins and packages. The need to add application-specific code to the CRM engine moves this last piece to Layer 5.
Using the NextFlow pipeline engine, what is code and what is configuration? Our AWS consultants encountered this question recently at a life sciences organization that uses NextFlow to help with its scientific workflows. NextFlow can adapt pipelines written in the most common scripting languages, so organizations can define pipelines specific to their own requirements. In this case, the NextFlow engine is considered Layer 4, as it is an element required by the NextFlow application. The pipeline definitions are classified as Layer 5 because they are application-specific code needed to tailor NextFlow to this specific company.
Are models and scripts written to interact with QlikView, a guided analytics app, considered code or configuration? We were faced with this question in our work with a marketing technology provider that uses QlikView as part of its technology stack. QlikView requires an engine to operate within the environment. Answering this question with the boundaries outlined above, Flux7 consultants define the QlikView engine as Layer 4. And though not a traditional view of coding, the models and scripts that the firm needs to write internally to interact with QlikView are Layer 5.
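The reasoning behind these three answers can be sketched as a small decision rule in Python. This is only our reading of the examples, not a formal algorithm: the `Component` class, its two boolean flags, and the `classify` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    runs_on_all_vms: bool    # hypothetical flag: part of the corporate standard build
    customer_specific: bool  # hypothetical flag: written or customized for this organization


def classify(component: Component) -> int:
    """Return the layer number (3, 4 or 5) for a software component."""
    if component.runs_on_all_vms:
        return 3  # Landing Zone Configuration
    if component.customer_specific:
        return 5  # Build Artifacts
    return 4      # App Configuration (third-party software the app requires)


# The components discussed in the questions above, plus one Layer 3 example.
EXAMPLES = [
    Component("logging agent", runs_on_all_vms=True, customer_specific=False),
    Component("PHP", runs_on_all_vms=False, customer_specific=False),
    Component("SugarCRM engine", runs_on_all_vms=False, customer_specific=False),
    Component("SugarCRM plug-ins and packages", runs_on_all_vms=False, customer_specific=True),
    Component("NextFlow engine", runs_on_all_vms=False, customer_specific=False),
    Component("NextFlow pipeline definitions", runs_on_all_vms=False, customer_specific=True),
    Component("QlikView engine", runs_on_all_vms=False, customer_specific=False),
    Component("QlikView models and scripts", runs_on_all_vms=False, customer_specific=True),
]

for component in EXAMPLES:
    print(f"Layer {classify(component)}: {component.name}")
```

The rule deliberately checks the organization-wide case first, since software that runs on every VM belongs in the Golden AMI regardless of who wrote it.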
While it’s easy to get caught up in definitions, at the end of the day we need tools to actively and effectively manage the environment. Luckily, definitions and boundaries also help us identify the best-fit tools for a given task. To put these definitions into action, the table below lists several example tools for the different boundary layers. You’ll notice that some of the examples are taken from our questions above:
|Layer||Examples||Task||Common Deployment Options|
|Infrastructure||Networking, security, etc.||Infrastructure Delivery||AWS CloudFormation, Terraform, Ansible|
|OS||OS||Infrastructure Delivery||AWS CloudFormation, Terraform, Ansible|
|Landing Zone Configuration||AWS Inspector, Splunk Agent||Configuration Delivery||Ansible, Chef, Puppet, SaltStack, with HashiCorp Packer to bake the Golden Image|
|App Configuration||Qlik, Python, NextFlow, SugarCRM||Configuration Delivery||Ansible, CloudFormation::Init, Chef, Puppet, SaltStack (can use Packer to bake AMI or provision these on the fly as the instance boots)|
|Build Artifacts||Anything customer specific, e.g., models, scripts, plug-ins||CodeDeploy||AWS CodeDeploy Manual, Jenkins, TeamCity|
As part of a growing community of DevOps professionals working within the AWS environment, our goal is to help clarify terminology in a way that benefits us all, making the task of developing and managing AWS environments that much easier and more effective. What do you think? Is it important to have new definitions as technology infrastructure evolves? And if so, do you find these new boundaries and definitions helpful? Leave your comment below.