Hackfoofery

Alson Kemp

Organizing Terraform Projects

Written by alson

February 14th, 2018 at 11:26 am

Posted in Turbinado


At Teckst, we use Terraform for all configuration and management of infrastructure.  The tidal boundary between infrastructure and application configuration is largely determined by which kinds of applications will be deployed on which kinds of infrastructure.  Standing up a bunch of customized EC2 instances (e.g. Cassandra)?  That’s probably something Ansible or Chef is better suited to, as they’re built to step into a running instance and create and/or update its configuration (though, certainly, this can be done via Terraform and EC2 User Data scripts).

Teckst uses 100% AWS services and 100% Lambda for compute, so we have a much more limited need.  We need Lambda Functions, API Gateways, SQS Queues, S3 Buckets, IAM Users, etc. to be created and wired together; thereafter, our Lambda Functions are uploaded by our CI system and run over the configured AWS resources.  In this case, Terraform is perfect for us, as it walks our infrastructure right up to the line at which our Lambda Functions take over.
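For a flavor of that, here’s a minimal sketch of such a Lambda definition (hypothetical names, Terraform 0.11-era syntax; the IAM role is assumed to be defined elsewhere):

# Terraform creates the Function; CI uploads the real code afterward.
resource "aws_lambda_function" "api_handler" {
  function_name = "api-handler-${var.environment}"
  role          = "${aws_iam_role.lambda.arn}"   # hypothetical IAM role
  handler       = "index.handler"
  runtime       = "nodejs6.10"
  filename      = "placeholder.zip"              # stub artifact; CI replaces it
}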

Terraform’s documentation provides little in the way of guidance on structuring larger Terraform projects.  The docs do talk about modules and outputs, but no fleshed-out examples are provided for how you should structure your project.  That said, many guides are available on the web ([1][2][3] are the top three Google results as of this writing).

Terraform Modules

Terraform Modules allow you to create modules of infrastructure which accept/require specific Variables and yield specific Outputs.  Besides being a great way to utilize third-party scripts (e.g. a script you find on GitHub to build a fully configured EC2 instance with Nginx fronting a Django application), Modules allow a clean, logical separation between environments (e.g. Production and Staging).  A good example of organizing a project using Terraform Modules is given in this blog post.  Initially, we approached organizing our scripts similarly:

/prod/main.tf
/prod/vpc.tf - production configs for VPC module 
/staging/main.tf
/staging/vpc.tf - staging configs for VPC module
/modules/vpc/main.tf - the generic VPC module definition
/modules/vpc/variables.tf
/modules/vpc/outputs.tf
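
For illustration, /prod/main.tf might reference the module like so (a sketch; the variable name is hypothetical, Terraform 0.11-era syntax):

# /prod/main.tf
module "vpc" {
  source   = "../modules/vpc"
  vpc_cidr = "10.0.0.0/16"   # production-specific value
}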

Now all of our prod configuration values are separate from our staging configuration values, and the prod and staging scripts reference our generic vpc Module.  Initially, this seemed like a huge win.  Read on to find out why it might not be a win for in-house-defined infrastructure.

Terraform Modules and First-Party Scripts

While Terraform Modules are a great way to use third-party Terraform scripts to build your infrastructure, they felt awkward when using only first-party scripts/definitions.  Reviewing other ways of organizing the scripts, they seemed to rely on Copy-and-Paste, and that’s a Bad Thing, right?  But what if Copy-and-Paste is unavoidable?  With Terraform Modules, each module defines a set of Variables it accepts and a list of Outputs it produces.  In order to utilize Modules in first-party circumstances, you will need to determine a naming scheme for the Variables and Outputs (when using a third-party module, you must adopt its naming scheme).  Any mismatch between the names of Variables or Outputs will yield baffling error messages.  So a lot of time is spent threading Variables up into sub-Modules, only to grab Outputs from those sub-Modules and thread them down and then up into other sub-Modules via other Variables.  The best example of this is VPC information: nearly every Module needs the VPC’s id threaded into it.  The safest way to make sure your Variable declarations line up with a Module’s Variable requirements is to Copy-and-Paste the names to and from the Modules.  Consider your VPC and Route53 scripts:

/main.tf
/terraform.tfvars
/vpc/main.tf
/vpc/outputs.tf
/vpc/variables.tf
/route53/main.tf
/route53/outputs.tf
/route53/variables.tf

We’ll need to:

  1. In /main.tf: reference the vpc Module, passing in appropriate Variables.
  2. In /vpc/variables.tf: define Variables this Module will accept.
  3. In /vpc/outputs.tf: map the Module’s configuration values to the Module’s Outputs.
  4. In /vpc/main.tf: create the VPC using the Variables specified in /vpc/variables.tf.
  5. In /main.tf: reference the route53 Module, passing in as Variables the Outputs retrieved from the vpc Module.
  6. Then do Steps 2-5 for the /route53 Module (see the sketch below).
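
Here’s roughly what that threading looks like (a sketch in Terraform 0.11-era syntax; all names are hypothetical):

# /main.tf
module "vpc" {
  source   = "./vpc"
  vpc_cidr = "10.0.0.0/16"
}

module "route53" {
  source = "./route53"
  vpc_id = "${module.vpc.vpc_id}"   # vpc's Output threaded into route53's Variable
}

# /vpc/variables.tf
variable "vpc_cidr" {}

# /vpc/outputs.tf
output "vpc_id" { value = "${aws_vpc.main.id}" }

# /route53/variables.tf
variable "vpc_id" {}   # must exactly match the name /main.tf passes in

Misspell vpc_id in any one of those places and Terraform will complain, often cryptically.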

The above has three significant issues:

  1. It’s tedious.  That’s something like 11 steps to create a VPC and Route53 domain.  That’s about 8 lines in a single Terraform script…  Here we’ve got 2-3 times that scattered across 7 files.
  2. It’s error-prone.  Didn’t get your naming convention right for one of the Outputs of some Module?  That’s going to happen.  And the Terraform errors aren’t too helpful in finding the issue.
  3. It’s based on Copy-and-Paste!  The root of all evil!  Sure, you don’t need to Copy-and-Paste, but you’re going to do so as, for the hundredth time, you write the name of a Module’s Output into a Variable declaration…  And as you misspell the Output’s name…

Less-Bad Terraform Organization

So I’m not sure we can avoid Copy-and-Paste.  If we can’t avoid it, can we organize our Terraform scripts in such a way that we minimize likely harm?  We approached it as follows, attempting to:

  • Strictly limit the surface area of the production and staging configurations so that it was obvious which was which.
  • Eliminate the differences everywhere outside of those production and staging configuration areas.
  • Make clear how to migrate staging infrastructure alterations forward to production infrastructure.
  • Make clear where Copy-and-Paste is allowed.
  • Simplify our usage of Terraform.

We wound up with the following (very simplified) organization:

/prod/vpc.tf
/prod/route53.tf
/prod/variables.tf
/staging/vpc.tf
/staging/route53.tf
/staging/variables.tf

That’s it.  What does this get us?

  • Very simple Terraform usage.  No need to trace Variables and Outputs up/down through Modules.  route53.tf can directly reference aws_vpc.main.id.
  • Very limited surface area for configuration.  All environment-specific configuration values go in variables.tf.  It’s easy to compare the prod and staging variables.tf files to see new or changed variables.
  • Simple migration to prod.  Just meld (or diff) staging and production.  We know the variables.tf will differ, so, with suitable review, the content of staging can be quickly and easily pulled into production.

An example of the content of /staging/variables.tf:

########################
# ACCOUNT SETTINGS
########################
variable "environment"    {default = "staging"}
variable "account_id"     {default = "acct-staging"}
variable "account_number" {default = 123456789 }
variable "aws_region"     {default = "us-east-1"}

########################
# DOMAIN SETTINGS
########################
variable "root_domain"         {default = "not-real.com"}
variable "root_private_domain" {default = "not-real.internal"}

Additional Considerations/Practices

Is it that simple?  Not quite, but it’s close.  We have some development resources in our staging environment that don’t exist in our production environment (looking at you, crazy MS SQL Server instance), but it’s pretty obvious to anyone involved in the infrastructure that those resources are staging-only.

Currently, one or two people do all of the infrastructure development, so we haven’t really wrestled with conflicts in the terraform.tfstate but that will be an issue regardless of the organization of Terraform files.


Simple Decoders for Kids

Written by alson

February 25th, 2014 at 10:25 pm

Posted in Geekery


My wife created simple symbol-letter decoders for my son.  He thought they were a lot of fun and wanted to share them with friends, so I productized them.  Screenshot here:

[Screenshot of the decoder builder interface]

Simple, straightforward way to build fun little puzzles for kids.   Play with it here.  Besides changing the phrase, you can add additional confounding codes or remove codes to force kids to guess at the phrase.  Then click the Print button and you’ll have a nice printout with the control panel hidden.

I’m building a 2-D version for the codes, too, so that will be along later this week.


WebGL Fractals

Written by alson

December 28th, 2013 at 4:00 pm

Posted in Geekery


Years ago, I wrote a fractal generator/explorer for OpenGL.  Crazily enough, after nearly 10 years,  it still compiles without complaint on Linux.  But the web is the future [er… or, rather, the present], so…

So I ported the C version to CoffeeScript, AngularJS, LESS, Jade and [insert buzzword].  The port was actually very straightforward, with the majority of the time spent on building the UI, fiddling with AngularJS, adding fractals, refactoring, etc.  Nothing in the code is too surprising.  One controller handles the UI, two services manage application state and one service renders the fractal.

The app is here.  The code is on GitHub here.  To “compile” the code, you’ll need the NodeJS compilers for CoffeeScript, LESS and Jade.  Then run ./scripts/run_compilers.sh.  (Yes, I could have used Grunt or Gulp, but the simple bash script is really simple.)
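
For reference, such a script might look something like this (a sketch; the repo’s actual run_compilers.sh may differ in paths and flags):

#!/bin/bash
# Compile CoffeeScript, LESS and Jade sources into browser-ready assets.
coffee --compile --output js/ src/       # .coffee -> .js
lessc styles/main.less > css/main.css    # .less   -> .css
jade --out html/ views/                  # .jade   -> .html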

Screenie:

[Screenshot: the WebGL fractal explorer]

Interesting links:

  1. Link
  2. Link
  3. Link
  4. Link
  5. Link
  6. Link

Pull requests, comments, suggestions, etc. are always welcome.  In particular, are there other fractals that you’d suggest?


Go experiment: de-noising

Written by alson

May 13th, 2013 at 11:34 pm

Posted in Programming


CoffeeScript is a great example of how to de-noise a language like JavaScript. (Of course, I know people who consider braces to be a good thing, but lots of us consider them noise and prefer significant whitespace, so I’m speaking to those folks.) What would Go code look like with some of CoffeeScript’s de-noising?

TL;DR : the answer is that de-noised Go would not look much different than normal Go…

As an experiment, I picked some rules from CoffeeScript and re-wrote the Mandelbrot example from The Computer Language Benchmarks Game. Note: this is someone else’s original Go code, so I can’t vouch for its quality…

Here’s the original Go code:

/* targeting a q6600 system, one cpu worker per core */
const pool = 4

const ZERO float64 = 0
const LIMIT = 2.0
const ITER = 50   // Benchmark parameter
const SIZE = 16000

var rows []byte
var bytesPerRow int

// This func is responsible for rendering a row of pixels,
// and when complete writing it out to the file.

func renderRow(w, h, bytes int, workChan chan int, iter int, finishChan chan bool) {

   var Zr, Zi, Tr, Ti, Cr float64
   var x,i int

   for y := range workChan {

      offset := bytesPerRow * y
      Ci := (2*float64(y)/float64(h) - 1.0)

      for x = 0; x < w; x++ {
         Zr, Zi, Tr, Ti = ZERO, ZERO, ZERO, ZERO
         Cr = (2*float64(x)/float64(w) - 1.5)

         for i = 0; i < iter && Tr+Ti <= LIMIT*LIMIT; i++ {
            Zi = 2*Zr*Zi + Ci
            Zr = Tr - Ti + Cr
            Tr = Zr * Zr
            Ti = Zi * Zi
         }

         // Store the value in the array of ints
         if Tr+Ti <= LIMIT*LIMIT {
            rows[offset+x/8] |= (byte(1) << uint(7-(x%8)))
         }
      }
   }
   /* tell master I'm finished */
   finishChan <- true
}

My quick de-noising rules are:

  • Eliminate var since it can be inferred.
  • Use ‘:’ instead of const (a la Ruby’s symbols).
  • Eliminate func in favor of ‘->’ and variables for functions.
  • Replace braces {} with significant whitespace
  • Replace C-style comments with shell comments “#”
  • Try to leave other spacing alone so as not to fudge the line count
  • Replace simple loops with an “in” and range form

The de-noised Go code:

# targeting a q6600 system, one cpu worker per core
:pool = 4

:ZERO float64 = 0  # These are constants
:LIMIT = 2.0
:ITER = 50   # Benchmark parameter
:SIZE = 16000

rows []byte
bytesPerRow int

# This func is responsible for rendering a row of pixels,
# and when complete writing it out to the file.

renderRow = (w, h, bytes int, workChan chan int, iter int, finishChan chan bool) ->

   Zr, Zi, Tr, Ti, Cr float64
   x,i int

   for y := range workChan
      offset := bytesPerRow * y
      Ci := (2*float64(y)/float64(h) - 1.0)

      for x in [0..w]
         Zr, Zi, Tr, Ti = ZERO, ZERO, ZERO, ZERO
         Cr = (2*float64(x)/float64(w) - 1.5)

         i = 0
         while i++ < iter && Tr+Ti <= LIMIT*LIMIT
            Zi = 2*Zr*Zi + Ci
            Zr = Tr - Ti + Cr
            Tr = Zr * Zr
            Ti = Zi * Zi

         # Store the value in the array of ints
         if Tr+Ti <= LIMIT*LIMIT
            rows[offset+x/8] |= (byte(1) << uint(7-(x%8)))
   # tell master I'm finished
   finishChan <- true

That seems to be a pretty small win in return for a syntax adjustment that does not significantly enhance readability. Some bits are nice: I prefer the significant whitespace, but the braces just aren’t that obtrusive in Go; I do prefer the shell comment style, but it’s not a deal breaker; the simplified loop is nice, but not incredible; eliding “var” is okay, but harms readability given the need to declare the types of some variables; I do prefer the colon for constants. Whereas CoffeeScript can dramatically shorten and de-noise a JavaScript file, it looks as though Go is already pretty terse.

Obviously, I didn’t deal with all of Go in this experiment, so I’ll look over more of it soon, but Go appears to be quite terse already given its design…


[Synthetic] Performance of the Go frontend for GCC

Written by alson

May 5th, 2013 at 2:35 pm

Posted in Programming


First, a note: this is a tiny synthetic bench.  It’s not intended to answer the question: is GCCGo a good compiler?  It is intended to answer the question: as someone investigating Go, should I also investigate GCCGo?

While reading some announcements about the impending release of Go 1.1, I noticed that GCC was implementing a Go frontend.  Interesting.  So the benefits of the Go language coupled with the GCC toolchain?  Sounds good.  The benefits of the Go language combined with GCC’s decades of x86 optimization?  Sounds great.

So I grabbed GCCGo and built it.  Instructions here: http://golang.org/doc/install/gccgo

Important bits:

  • Definitely follow the instructions to build GCC in a separate directory from the source.
  • My configuration was:

/tmp/gccgo/configure --disable-multilib --enable-languages=c,c++,go

I used the Mandelbrot script from The Benchmarks Game at mandlebrot.go.  Compiled using go and gccgo, respectively:

go build mandel.go
gccgo -v -lpthread -B /tmp/gccgo-build/gcc/ -B /tmp/gccgo-build/lto-plugin/ \
  -B /tmp/gccgo-build/x86_64-unknown-linux-gnu/libgo/ \
  -I /tmp/gccgo-build/x86_64-unknown-linux-gnu/libgo/ \
  -m64 -fgo-relative-import-path=_/home/me/apps/go/bin \
  -o ./mandel.gccgo ./mandel.go -O3

Since I didn’t install GCCGo, and after flailing at compiler options to get “go build” to find includes, libraries, etc., I gave up on the simple “go -compiler” syntax for gccgo.  So the above gccgo command is the sausage-making version.

So the two files:

4,532,110 mandel.gccgo  - Compiled in 0.3s
1,877,120 mandel.golang - Compiled in 0.5s

As a HackerNewser noted, stripping the executables could be good. Stripped:

1,605,472 mandel.gccgo
1,308,840 mandel.golang

Note: the stripped GCCGo executables don’t actually work, so take the “stripped” value with a grain of salt for the moment. Bug here.

GCCGo produced an *unstripped* executable 2.5x as large as Go produced. Stripped, the executables were similar, but the GCCGo executable didn’t work. So far the Go compiler is winning.

Performance [on a tiny, synthetic, CPU bound, floating point math dominated program]:

time ./mandel.golang 16000 > /dev/null 

real  0m10.610s
user  0m41.091s
sys  0m0.068s

time ./mandel.gccgo 16000 > /dev/null 

real  0m9.719s
user  0m37.758s
sys  0m0.064s

So GCCGo produces executables that are about 10% faster than Go’s, but the unstripped executable is about 2.5x the size.  I think I’ll stick with the Go compiler for now, especially since the tooling built into/around Go is very solid.

Additional notes from HN discussion:

  • GCC was 4.8.0.  Go was 1.1rc1.  Both AMD64.


Parsing a DICOM file for fun and profit

Written by alson

January 18th, 2013 at 11:34 pm

Posted in Turbinado


For various reasons, I needed to have a CT scan of my jaw done.  This is a very high resolution 3D scan.  The technician needed to review the scan to make sure it was of high quality and I stood behind him and looked over his shoulder.  The software was pretty impressive, but the 3D model and resolution were really impressive.  And then I left the office and drove home…

… and as I was driving, I thought: wouldn’t it be fun to have a copy of the data?  Perhaps I could build a point cloud and shoot it into a crystal (as I’d done with fractals)?  So I called the lab (Park XRay) back and asked if I could have a copy of the data.  “Sure!  It’s your skull.” was the reply, and they delivered an extra copy to my dentist.

The files were in DICOM format and were produced by, and for use with, iCATVision.  Fortunately, Python has a DICOM library, so it was fairly easy to parse the files.  My code is on GitHub.  [The code is not pretty, but it worked.]

I’ve previously “printed” point clouds into crystals using Precision Laser Art, so I needed to convert the 448 16-bit slices of my jaw into a 1-bit XYZ point cloud.  “visualize.py” provides a simple 2D visualization of the slices.  Most importantly, it let me tune the threshold values for the quantizer so that the point cloud would highlight interesting structures in my jaw.  Here’s the interface (perhaps it’s obvious, but I’m not a UX expert…):

[Screenshot: the visualize.py threshold-tuning interface]

Once I’d tuned the parameters, I added those parameters to “process.py” and generated the giant XYZ point cloud.  The format of the point cloud is just:

X1 Y1 Z1
X2 Y2 Z2
X3 Y3 Z3
[repeat about 3 million times...]
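
The conversion itself is conceptually tiny.  Here’s a minimal sketch of the idea behind process.py (not the actual code; it assumes the pydicom and numpy packages, and the file layout and threshold value are hypothetical):

# Sketch: convert a stack of 16-bit DICOM slices to a 1-bit XYZ point cloud.
import glob

import numpy as np
import pydicom

THRESHOLD = 1500  # quantizer cutoff, tuned by eye in visualize.py (made-up value)

with open("pointcloud.xyz", "w") as out:
    # Each DICOM file is one horizontal slice; its index is the Z coordinate.
    for z, path in enumerate(sorted(glob.glob("slices/*.dcm"))):
        pixels = pydicom.dcmread(path).pixel_array   # 16-bit grayscale slice
        ys, xs = np.nonzero(pixels > THRESHOLD)      # quantize to 1 bit: keep "bone"
        for x, y in zip(xs, ys):
            out.write("%d %d %d\n" % (x, y, z))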

I sent my order to Precision Laser Art and, after 7 days and $100, received a nicely cushioned package containing the resulting crystal: the C-888 80mm cube.

While it was hard to photograph, my vertebrae and hyoid bone are clearly visible in the crystal.

Anyhow, the point is: medical data is cool.  You can get it, so get it and play with it!  😉


Checking memcached stats and preventing empty responses

Written by alson

October 10th, 2012 at 12:32 pm

Posted in Tools


A quick google for how to check memcached stats turns up the following command:

echo stats | nc internal.ip 11211

Netcat is a utility for poking about in just about any network interface or protocol, so it can be used to pipe information to memcached.  Note: you’ll need to have netcat installed in order to have the “nc” command, and Debian/Ubuntu have both netcat-traditional and netcat-openbsd.  Install the openbsd version.

The problem I had was that checking stats returned a blank response about 90% of the time.  The cause of the issue is that netcat sends the string “stats” to memcached, declares victory and then closes the connection before memcached has a chance to reply.  The solution?  Just tell netcat to wait a bit using the “-i” flag, which adds a delay after each line of text is sent.  Like this:

echo stats | nc -i 1 internal.ip 11211

To check a remote machine, I wound up with:

 ssh the_remote_machine "echo stats | nc -i 1 internal.ip 11211"


Django: handling configs for multiple environments

Written by alson

October 7th, 2012 at 2:42 pm

Posted in Turbinado


A common way to manage settings across environments in Django is to create a local_settings.py file and then copy environment-specific settings into it during deployment.  Although much of our web work is done in Django now, the Rails way of managing environments is superior.

In your project, create a settings_env directory and put into it local.py, dev.py, etc. files for environment-specific setup.
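
The layout looks something like this (a hypothetical project tree; note the __init__.py, which makes settings_env an importable package):

myproject/
    settings.py
    settings_env/
        __init__.py
        local.py
        dev.py
        staging.py
        production.py

Then, at the bottom of settings.py, select the environment: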

## ^^ standard settings.py above

import os

# Pull in the settings for the specific environment,
# selected via the DJANGO_ENV environment variable.
env = os.environ.get('DJANGO_ENV')

if env == "production" : from settings_env.production import *
elif env == "staging"  : from settings_env.staging    import *
elif env == "dev"      : from settings_env.dev        import *
elif env == "local"    : from settings_env.local      import *
else:
    print "######################################################"
    print " No environment specified or specified environment"
    print " does not exist in /settings_env/."
    print " Continuing with no settings overrides."
    print " To specify an environment (e.g. production), use"
    print "  DJANGO_ENV=production ./manage.py runserver"
    print "######################################################"
    quit()

if DEBUG:
    pass  ## settings overrides for DEBUG = True go below
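
The environment is then chosen at invocation time, e.g.:

DJANGO_ENV=local ./manage.py runserver
DJANGO_ENV=production ./manage.py collectstatic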


Tunnel MySQL over SSH to remote server issue

Written by alson

September 7th, 2012 at 2:29 pm

Posted in Turbinado


There are a million pages about this, but I just bumped into a tricky issue and figured I’d drop a quick note about it.

First off, tunnel MySQL to the server (on a UNIXy box) by doing:

ssh  -L 33306:localhost:3306 your-server.com

Simply, that tells SSH to listen on local port 33306 (a port chosen because it’s an obvious variation on MySQL’s default of 3306).  When something connects to that port, SSH will accept the connection and forward it to the remote host, which will then connect it to the appropriate host and port on the far side.  In this case, we’re asking the server to connect to port 3306 on localhost, but you could connect to any server and any port.

The tricky issue was that, on my Debian laptop, MySQL uses a socket file for communication even if you specify a port.  So this will fail and leave you talking to your local MySQL instance:

mysql --port=33306  -u your_remote_user -pyour_remote_password

In order to force MySQL to use TCP (instead of a socket), you can force the connection protocol or specify a host (note: you can’t use ‘localhost’; you need to use 127.0.0.1):

mysql --protocol=tcp --port=33306 \
   -u your_remote_user -pyour_remote_password
mysql -h 127.0.0.1 --port=33306  \
   -u your_remote_user -pyour_remote_password


My Apache process is only using one core!

Written by alson

January 4th, 2012 at 11:41 pm

Posted in Geekery


I was recently working on a client site (a good-sized one) and was checking on the health of their application servers.  I noticed that each of their app servers was running a few of the cores much harder than the others.  This was in the evening, and they get most of their traffic during the day; the site runs Django under mod_wsgi in daemon mode with 8 processes and 25 threads per process.  Further, the boxes were not VPSs/VMs but dedicated, multicore boxes.  So they had multicore hardware and the web server was running in a multicore-friendly way.
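
For reference, that daemon-mode setup corresponds to an Apache config along these lines (a sketch; the names and paths are hypothetical):

WSGIDaemonProcess theapp processes=8 threads=25
WSGIProcessGroup theapp
WSGIScriptAlias / /srv/theapp/theapp/wsgi.py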

At the time, the load on each box was around 0.5.  And various process IDs rotated as the top CPU users, so processes weren’t bound to a core.  The ratio of utilization between the cores (ignoring actual core numbers and focusing on the utilization of each core, since each box was different) was something like:

Core # : Core Utilization
1      : 15%
2      : 2%
3      : 1%
*      : 0%

So why would one core bear most of the load?  I googled and googled and found little useful information.  I banged on one server with Apache’s benching tool (“ab”) while watching core utilization and, sure enough, all cores shared the load equally.  So what was going on?
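
Something like the following reproduces that test (the URL is hypothetical; mpstat comes from the sysstat package):

# Hammer the app server...
ab -n 100000 -c 64 http://app-server.example.com/

# ...while watching per-core utilization in another terminal.
mpstat -P ALL 1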

I’m not sure if it’s the Linux kernel scheduler or a natural outcome of CPU caches, but the simplest explanation is that, in low-load situations, similar processes will flock to the same core thanks to cache affinity.  Rather than spreading a set of processes across a set of cores that don’t necessarily share the same cache, the system gravitates toward the cores that experience the fewest cache misses.

Upshot: it’s rational for the system to schedule most operations of a process, or a group of similar processes, on one core when the system is relatively lightly loaded.  This is especially true if the cores are “Hyperthreading” and sharing resources (read: their caches)!
