Alson Kemp

Hackfoofery

Parsing a DICOM file for fun and profit

without comments

For various reasons, I needed to have a CT scan of my jaw done.  This is a very high-resolution 3D scan.  The technician needed to review the scan to make sure it was of high quality, and I stood behind him and looked over his shoulder.  The software was impressive, but the 3D model and its resolution were really impressive.  And then I left the office and drove home…

… and as I was driving, I thought: wouldn’t it be fun to have a copy of the data?  Perhaps I could build a point cloud and shoot it into a crystal (as I’d done with fractals).  So I called the lab (Park XRay) back and asked if I could have a copy of the data.  “Sure!  It’s your skull.” was the reply, and they delivered an extra copy to my dentist.

The files were in DICOM format and were produced by (or for use with) iCATVision.  Fortunately, Python has a DICOM library, so it was fairly easy to parse the files.  My code is on GitHub.  [The code is not pretty, but it worked.]

I’ve previously “printed” point clouds into crystals using Precision Laser Art, so I needed to convert the 448 16-bit slices of my jaw into 1-bit XYZ point clouds.  “visualize.py” provides a simple 2D visualization of the slices.  Most importantly, it let me tune the threshold values for the quantizer so that the point cloud would highlight interesting structures in my jaw.  Here’s the interface (perhaps it’s obvious, but I’m not a UX expert…):

Once I’d tuned the parameters, I added those parameters to “process.py” and generated the giant XYZ point cloud.  The format of the point cloud is just:

X1 Y1 Z1

X2 Y2 Z2

X3 Y3 Z3

[repeat about 3 million times...]
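The 16-bit-to-1-bit quantization at the heart of this is simple; here’s a rough sketch (the function name, data layout, and threshold values are illustrative, not the actual ones from process.py):

```python
# Rough sketch of the 16-bit -> 1-bit quantizer. Each slice is a 2D grid of
# intensities; a voxel becomes an XYZ point when its value falls inside the
# tuned threshold window.
def slices_to_xyz(slices, lo, hi):
    lines = []
    for z, slice_ in enumerate(slices):
        for y, row in enumerate(slice_):
            for x, value in enumerate(row):
                if lo <= value <= hi:          # inside the window -> keep it
                    lines.append(f"{x} {y} {z}")
    return lines

# Tiny synthetic example: one 2x2 slice, one voxel inside the window.
print(slices_to_xyz([[[100, 40000], [500, 1200]]], lo=1000, hi=35000))  # → ['1 1 0']
```

With 448 slices at full resolution, this is where tuning `lo` and `hi` against the 2D visualization pays off: the window decides which structures survive into the crystal.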

I sent my order to Precision Laser Art and, after 7 days and $100, received this:

Which had a nicely cushioned interior:

And this is the resulting crystal.  It’s the C-888 80mm cube.

While it’s not amazingly easy to see in this photo, my vertebrae and hyoid bone are clearly visible in the crystal.

Anyhow, the point is: medical data is cool.  You can get it, so get it and play with it!  😉

Written by alson

January 18th, 2013 at 11:34 pm

Posted in Turbinado

Checking memcached stats and preventing empty responses

without comments

A quick google for how to check stats for memcached quickly turns up the following command:

echo stats | nc internal.ip 11211

Netcat is a utility for poking at just about any network interface or protocol, so it can be used to pipe information to memcached.  Note: you’ll need netcat installed in order to have the “nc” command, and Debian/Ubuntu have both netcat-traditional and netcat-openbsd.  Install the openbsd version.

The problem I had was that checking stats returned a blank response about 90% of the time.  The cause of this issue is that netcat sends the string “stats” to memcached, declares victory, and closes the connection before memcached has a chance to reply.  Solution?  Just tell netcat to wait a bit using the “-i” flag, which makes it pause after sending each line of text.  Like this:

echo stats | nc -i 1 internal.ip 11211

To check a remote machine, I wound up with:

 ssh the_remote_machine "echo stats | nc -i 1 internal.ip 11211"
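If you’d rather not depend on netcat’s timing at all, the exchange is easy to script.  Here’s a rough sketch in Python (host and port are placeholders) that waits for memcached’s terminating “END” line instead of sleeping:

```python
import socket

def memcached_stats(host="127.0.0.1", port=11211, timeout=2.0):
    # Send "stats" and keep reading until the reply's terminating "END" line,
    # instead of closing the connection immediately the way plain nc does.
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"stats\r\n")
        buf = b""
        while not buf.endswith(b"END\r\n"):
            chunk = sock.recv(4096)
            if not chunk:
                break
            buf += chunk
    return parse_stats(buf.decode())

def parse_stats(text):
    # Reply lines look like "STAT curr_connections 10".
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats
```

Same idea as `-i 1`, but deterministic: the client closes only after the full reply has arrived.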

Written by alson

October 10th, 2012 at 12:32 pm

Posted in Tools

Django: handling configs for multiple environments

without comments

A common way to manage settings across environments in Django is to create a local_settings.py file and copy environment-specific settings into it during deployment.  Although much of our web work is done in Django now, the Rails way of managing environments is superior.

In your project, create a settings_env directory and put into it local.py, dev.py, etc. files for environment-specific setup.

## ^^ standard settings.py above
# Pull in the settings for the environment named in
# the DJANGO_ENV environment variable.
import os

env = os.environ.get('DJANGO_ENV')

if env == "production" : from settings_env.dw      import *
elif env == "staging"  : from settings_env.staging import *
elif env == "dev"      : from settings_env.dev     import *
elif env == "local"    : from settings_env.local   import *
else:
    print "######################################################"
    print " No environment specified or specified environment"
    print " does not exist in /settings_env/."
    print " To specify an environment (e.g. production), use"
    print "  DJANGO_ENV=production ./manage.py runserver"
    print "######################################################"
    quit()

if DEBUG:
    pass  ## settings.py for DEBUG = True continues below
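For what it’s worth, the if/elif chain could also be replaced with a dynamic import.  This is just a sketch of an alternative, not the code I actually use, and the helper names are mine:

```python
import importlib
import types

def override_settings(mod, namespace):
    # Django settings are UPPERCASE names; copy only those.
    for name in dir(mod):
        if name.isupper():
            namespace[name] = getattr(mod, name)

def load_env_settings(env, namespace, package="settings_env"):
    # e.g. load_env_settings(os.environ.get('DJANGO_ENV', 'local'), globals())
    override_settings(importlib.import_module(package + "." + env), namespace)

# Demo with an in-memory module standing in for settings_env.dev:
fake = types.ModuleType("fake_settings")
fake.DEBUG = True
fake.helper = lambda: None        # lowercase, so not copied
ns = {}
override_settings(fake, ns)
print(ns)  # → {'DEBUG': True}
```

The upside is that adding a new environment means adding a file, not editing settings.py.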

Written by alson

October 7th, 2012 at 2:42 pm

Posted in Turbinado

Tunnel MySQL over SSH to remote server issue

without comments

There are a million pages about this, but I just bumped into a tricky issue and figured I’d drop a quick note about it.

First off, tunnel MySQL to the server (on a UNIXy box) by doing:

ssh -L 33306:localhost:3306 your-server.com

Simply put, that tells SSH to listen on local port 33306 (a port chosen because it’s an obvious variation on MySQL’s default of 3306).  When something connects to that port, SSH will accept the connection and forward it to the remote host, which will then connect it to the appropriate host and port on the far side.  In this case, we’re asking the server to connect to port 3306 on localhost, but you could connect to any server and any port.
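If you use the tunnel regularly, the same forwarding can be declared once in ~/.ssh/config (the host names here are placeholders) and started with a plain `ssh your-server`:

```
Host your-server
    HostName your-server.com
    LocalForward 33306 localhost:3306
```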

The tricky issue was that on my Debian laptop, MySQL uses a socket file for communication even if you specify a port.  So this will fail and leave you talking to your local MySQL instance:

mysql --port=33306  -u your_remote_user -pyour_remote_password

In order to force MySQL to use TCP (instead of the socket), you can force the connection protocol or specify a host (note: you can’t use ‘localhost’; you need to use 127.0.0.1):

mysql --protocol=tcp --port=33306 \
   -u your_remote_user -pyour_remote_password
mysql -h 127.0.0.1 --port=33306  \
   -u your_remote_user -pyour_remote_password

Written by alson

September 7th, 2012 at 2:29 pm

Posted in Turbinado

My Apache process is only using one core!

with one comment

I was recently working on a client site (a good-sized one) and was checking on the health of their application servers.  I noticed that each of their app servers was running a few of the cores much harder than the others.  This was in the evening, and they get most of their traffic during the day; the site runs Django under mod_wsgi in daemon mode with 8 processes and 25 threads per process.  Further, the boxes were not VPSs/VMs, but dedicated, multicore machines.  So they had multicore hardware and the web server was running in a multicore-friendly way.

At the time, the load for each box was around 0.5.  And various process IDs rotated as the top CPU users, so process IDs weren’t bound to a core.  The ratio of traffic between the cores (ignoring actual core number and focusing on the utilization of each core, since each box was different) was something like:

Core # : Core Utilization
1      : 15%
2      : 2%
3      : 1%
*      : 0%

So why would one core bear most of the load?  I googled and googled, and found little useful information.  I banged on one server with Apache’s benchmarking tool (“ab”) while watching core utilization and, sure enough, all cores shared the load equally.  So what was going on?

I’m not sure whether it’s the Linux kernel’s scheduler or a natural outcome of CPU caches, but the simplest explanation is that, under low load, similar processes will flock to the same core because its caches are already warm.  Rather than spreading a set of processes across cores that don’t share a cache, the scheduler gravitates toward the cores that experience the fewest cache misses.

Upshot: it’s rational for the system to schedule most operations of a process or group of similar processes on one core when a system is relatively lightly loaded.  This is especially true if the cores are “Hyperthreading” and are sharing resources (read: their caches)!
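If you want to watch this behavior yourself on Linux, /proc records the core each task last ran on.  Here’s a rough spot-check sketch (the field arithmetic follows proc(5); this isn’t what I used at the time):

```python
import os

def last_cpu(pid="self"):
    # proc(5): field 39 of /proc/<pid>/stat is "processor", the CPU the
    # task last executed on.
    path = "/proc/%s/stat" % pid
    if not os.path.exists(path):      # not a Linux box
        return None
    with open(path) as f:
        stat = f.read()
    # comm (field 2) can contain spaces and parens; split after its last ')'.
    fields = stat.rsplit(")", 1)[1].split()
    return int(fields[36])            # fields[0] is field 3, so 39 -> index 36

print(last_cpu())
```

Sampling this across a server’s worker PIDs every few seconds makes the clustering (or lack of it) obvious.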

Written by alson

January 4th, 2012 at 11:41 pm

Posted in Geekery

ExtJS 4 Models + node.js

with 9 comments

Finally starting to play with node.js.   Also, getting back into developing with the lovely ExtJS.  ExtJS 4 added strong support for client side models.

I was thinking that it’d be nice to share a lot of code for models between the client and server.   Turns out that it’s not that difficult.   Super quick-n-dirty code below.  Now the question is: how much duplication can be removed from client and server models? I don’t want to include all server-side code in client-side code, so might do something like:

  • /common/models/user.js – ExtJS model;
  • /client/models/user.js – tune the model for client side (e.g. add a REST connection to the server);
  • /server/models/user.js – includes client/models/user.js;  overrides critical bits (e.g. the Proxy); adds a bunch of server specific code.

If all of my models are in model.*, then I can probably iterate through them and auto-generate Mongoose models when the server boots…  Fun.

This is definitely a hack, but isn’t as frighteningly ugly as I expected:

fs = require('fs');

// stub out a fake browser for ext-core-debug.js
window = {
  navigator : 'Linux',
  attachEvent: function() {return false;}
};
navigator = {'userAgent' : 'node'};
document = {
  documentElement:'',
  getElementsByTagName : function () {return false;}};

// Helper function: read a JS file and eval it in the global scope
function injectJS(f) {
  eval(fs.readFileSync(f, 'ascii'));
}

//Pull in ExtJS components
injectJS('./ext-core-debug.js');
injectJS('./src/util/Filter.js');
injectJS('./src/util/Sorter.js');
injectJS('./src/util/Observable.js');
injectJS('./src/data/Connection.js');
injectJS('./src/Ajax.js');
injectJS('./src/util/Stateful.js');
injectJS('./src/util/Inflector.js');
injectJS('./src/util/MixedCollection.js');
injectJS('./src/data/ResultSet.js');
injectJS('./src/data/Batch.js');
injectJS('./src/data/Reader.js');
injectJS('./src/data/JsonReader.js');
injectJS('./src/data/Writer.js');
injectJS('./src/data/JsonWriter.js');
injectJS('./src/data/Errors.js');
injectJS('./src/data/Operation.js');
injectJS('./src/data/Proxy.js');
injectJS('./src/data/ServerProxy.js');
injectJS('./src/data/AjaxProxy.js');
injectJS('./src/data/RestProxy.js');
injectJS('./src/data/validations.js');
injectJS('./src/util/Date.js');
injectJS('./src/data/SortTypes.js');
injectJS('./src/data/Association.js');
injectJS('./src/data/Types.js');
injectJS('./src/util/Observable.js');
injectJS('./src/util/HashMap.js');
injectJS('./src/AbstractManager.js');
injectJS('./src/PluginMgr.js');
injectJS('./src/data/Field.js');
injectJS('./src/data/BelongsToAssociation.js');
injectJS('./src/data/HasManyAssociation.js');
injectJS('./src/data/PolymorphicAssociation.js');
injectJS('./src/data/Model.js');
injectJS('./src/ModelMgr.js');

// Register the model
Ext.regModel('models.User', {
    fields: [
        {name: 'name',  type: 'string'},
        {name: 'age',   type: 'int'},
        {name: 'phone', type: 'string'},
        {name: 'alive', type: 'boolean', defaultValue: true}
    ],
    validations: [
        {type: 'presence',  field: 'age'},
        {type: 'length',    field: 'name',     min: 2}
    ],
    changeName: function() {
        var oldName = this.get('name'),
            newName = oldName + " The Barbarian";
        this.set('name', newName);
    }
});

// Create an instance
var user = Ext.ModelMgr.create({
    name : 'Conan',
    age  : 24,
    phone: '555-555-5555'
}, 'models.User');

// Use the instance
user.changeName();
user.get('name'); //returns "Conan The Barbarian"
user.validate();
user.addEvents('changed');
user.events;
user.fireEvent('changed', 'my hair');

repl = require("repl");
repl.start('ExtJS> ');

Written by alson

February 22nd, 2011 at 2:50 pm

Posted in Geekery

KickLabs (SF Incubator)

without comments

A great incubator just opened in downtown San Francisco: KickLabs.  A ridiculously great space, a strong team, and a list of exciting events.  Definitely a place to get to know.

And they welcome entrepreneurs of all ages!

Written by alson

July 16th, 2010 at 2:18 pm

Posted in Geekery

Getting high on your own stash…

without comments

Lots of talk around right now about how distracted we are by technology. Too much reporting-speak about it, though. Here’s a different, more personal take:
http://tweetagewasteland.com/2010/06/say-hello-to-my-little-friend/

Great quote:

When the WiFi went down during the official iPhone 4 demo, didn’t you sort of wish Steve Jobs would turn to the crowd and say, “You know what, let’s just talk.”

Written by alson

June 18th, 2010 at 5:24 pm

Posted in Turbinado

NXDom rocks…

with one comment

A couple of months ago, Johann Rocholl announced NXDom.  Basically, Johann created an algorithm to generate millions of likely-to-be-interesting domain names and then queried DNS servers to see which ones they knew about.  Any domains that got NX (non-existent domain) responses from DNS servers were entered into the NXDom database as available.  Johann added a nice query interface and, voila!, changed how I search for domains.

Search for Domains V1

I don’t have any particular recipe for searching for domains, but I do something like:

  1. Think about my target market.
  2. Think up a few keywords.
  3. Try the keywords at http://www.instantdomainsearch.com.
  4. Find most are registered.
  5. Think up other less interesting keywords…
  6. Find one kinda crappy domain name that isn’t registered and register it because I’m frustrated…
  7. Dwell on the fact that I had a crappy domain name…

Search for Domains V2 (New and Improved with NXDom)

  1. Think about my target market.
  2. Think up a few keywords.
  3. Try the keywords at NXDom.
  4. See 300 results for each keyword.
  5. See lots of possibilities for great domains for each keyword.
  6. Register a great domain (making sure to click the NXDom affiliate link so that Johann gets his rev share).
  7. Dwell on my super cool domain.

The best part of NXDom is that it works very well for very popular terms.  A friend is in the CFO business, so we searched on domains ending in “CFO” and found numerous interesting domains.

FWIW

I don’t know Johann and I have no affiliation with NXDom.  Everyone to whom I show NXDom gets wide-eyed and starts thinking up domain ideas, so I figured that I’d share more broadly. NXDom has certainly changed, and greatly improved, the way I search for domains.

Written by alson

March 29th, 2010 at 1:48 pm

Posted in Tools

How do you maintain type-safety in web applications?

with 2 comments

Note: this post really concerns statically typed languages, but my little bits of example code are in Ruby since that’s the MVC implementation I know best.

I like MVC-ish-ness for web applications

The MVC-ish aspect of Rails is much clearer than the chaos of the ASP/PHP-ish world, so I’m bought into MVC-ish as a better way.  Separating request processing into MVC generally gets me into the right mindset when I’m looking at individual bits of code.  That said, Rails’ MVC still fails to fully satisfy my cravings for a web framework.

I don’t like the lack of type-safety

As Curt and I were discussing over in my last post, I am bothered by the lack of type-safety in MVC web frameworks.  In most MVC frameworks, type-safety is ignored or broken, e.g.:

  • Rails is basically not typed and Views can access the Controller’s instance variables;
  • ASP.NET uses an object called ViewState to pass data from Controllers to Views, but you have to cast ViewState to the right type in the View [yes, I’m treating ASP.NET as MVC…];
  • Turbinado copied ASP.NET’s ViewState and inherits its issues.

I’d like to figure out a way to maintain type-safety within my web application.  In the case of dynamic languages (e.g. Ruby), I’d like “type-clarity” so that I’m _nearly_ guaranteed of having the right type in the right place. For example [using Ruby syntax]:

The header of a web app might have an account section which behaves differently depending on whether a User has been loaded into @user in the Controller.  It’s easy enough to have the section behave differently based on @user.nil?, but I can’t guarantee that, if @user is not nil, @user is of type User.  *I* can certainly check the class of @user, but that leaves me open to errors at runtime when I _do_ change the type of @user and forget to update all of my type checks.

I want to be able to set the @user variable to a User in the Controller and have that type information enforced at compile time in the View.  Given an MVC-ish system, there seem to be two ways to do this:

  • Use different data types for each view and piece of a view.   This would be a nightmare since you’d have to define hundreds of separate types, build them differently and then send the right type to the right view.
  • Only ever create the values in question right near their usage, but this pushes Model fiddling and request params access down into the View, so doesn’t fit the web MVC style…

A Slight Modification to Rails’ MVC-ish-ness

Even if it isn’t the Rails MVC Way, it seems as though a reasonable way to maintain type safety would be to:

  • Don’t use Controller to set up variables for use in Views.
  • Mix more of the Controller functionality into the Models so that Models can manage creation of themselves.
  • Have Views use Models directly (and have Models cache/memo-ize themselves as necessary).
  • Use Controllers *only* for validation and routing.
  • Have Views play directly with the Models they require.

Currently, Rails’ MVC works something like the following:

---
some_controller.rb
class SomeController < ApplicationController
  def index
    @user = User.find(params[:user_id])
  end
end
---
some_view.html.erb

<% if !@user.nil? %>
  Welcome, <%= @user.name %>!   <= HEY! Is @user **really** a User?
<% else %>
  Log in!
<% end %>

---

Instead, how about giving the Model access to the params and letting it use its business rules to set itself up:
---
some_controller.rb
class SomeController < ApplicationController
  def index
  end
end
---
some_view.html.erb

<% user = User.from_request %>
<% if !user.nil? %>
  Welcome, <%= user.name %>!   <= HEY! That's almost guaranteed to work.
<% else %>
  Log in!
<% end %>

---
user_model.rb
class User < ActiveRecord::Base
  def self.from_request
    find(params[:user_id])  <= the model is accessing the Environment...
  end
end

This is wrong, right?

Per Terrence Parr in Enforcing Strict Model-View Separation in Template Engines [PDF]:

[…] there is widespread agreement that keeping the view as simple and free from entanglements with model and controller as possible is a worthwhile ideal to strive for.

[…] Strictly separating model and view is an extremely desirable goal because it increases flexibility, reduces maintenance costs, and allows coders and designers to work in parallel.

But I think Parr’s point is really that business logic shouldn’t appear in Views, not that Model access shouldn’t happen in a View.  Or is there some horrible breakdown that would occur if Models had access to request parameters and Controllers didn’t set up Models?

An issue that does arise with this modification is increased testing complexity when the Environment is accessible to Models.  Whereas now Models take a few parameters and then pass or don’t pass the test, allowing Models to access the Environment would mean that tests would have to set up request parameters, cookies, etc.  Simple, little, lovely Unit tests suddenly get larger and potentially uglier…

Type-safety + MVC ~= peanut butter & chocolate

Is type-safety a worthy goal to try to attain in MVC-ish web frameworks?  If so, what is the best way of achieving it?  In Rails, judicious application of filters and good naming schemes can get you “type-confident”, but you’re never type-safe.  Perhaps Rails has just the right blend for an MVC-ish web framework…

If MVC-ish web frameworks aren’t the way to go, what organization is better? Templating systems that only allow template markup/commands seem pretty safe, but I’d seriously miss the simplicity of mixed markup/code. ASP.NET compiles the template and then the “code behind” gets well-typed access to the XML, but I really prefer MVC to that organization.

Written by alson

March 18th, 2010 at 8:02 pm

Posted in Turbinado