Overly Sharpened Blog

They never told me what happens when you sharpen the saw too much...

Other Games of Life

I've been playing with duplicating some of the results on the emergence of cooperation in variations of iterated prisoner's dilemma.

An interesting (if somewhat trivial) thing to note is how easily behaviour emerges that is similar to, but distinct from, the Game of Life. A small grid randomly populated only with Defectors and Cooperators, for instance, often produces glider-like patterns.

My goal is to implement a dynamic programming approach to generating strategies, and the attached source betrays that goal with some complexity that is unnecessary for the behaviour I'm describing here, but anyways.
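For the curious, the glider-ish behaviour doesn't need any of that complexity. Here's a minimal sketch in the spirit of Nowak and May's spatial prisoner's dilemma; the payoff value, grid size, and update rule are my own assumptions for illustration, not taken from the attached source:

```python
import random

# Spatial prisoner's dilemma: each cell is a pure Cooperator ('C') or
# Defector ('D'); payoffs are R=1, T=B, S=P=0 (Nowak-May style).
B = 1.85          # temptation payoff (assumed; the patterns depend on it)
SIZE = 20

def payoff(me, other):
    if me == 'C':
        return 1.0 if other == 'C' else 0.0
    return B if other == 'C' else 0.0

def neighbours(x, y):
    # Moore neighbourhood, wrapping at the edges.
    return [((x + dx) % SIZE, (y + dy) % SIZE)
            for dx in (-1, 0, 1) for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)]

def step(grid):
    # Every cell plays its neighbours, then imitates the best scorer
    # among itself and its neighbours.
    scores = {cell: sum(payoff(s, grid[n]) for n in neighbours(*cell))
              for cell, s in grid.items()}
    return {cell: grid[max([cell] + neighbours(*cell), key=scores.get)]
            for cell in grid}

random.seed(1)
grid = {(x, y): random.choice('CD') for x in range(SIZE) for y in range(SIZE)}
for _ in range(10):
    grid = step(grid)
```

With a temptation payoff around 1.8-2.0, runs like this produce the shifting fronts and travelling clusters I'm talking about.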

Minting is Rare

Or, "How to resolve revocation in an immutable capability-secure world."

A path is stored not by minting a new reference to the target (a hardlink), but rather by storing the path itself (a symlink). Each segment of a path represents a node that serves as a proxy for the rest of the path (onion routing).
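To make that concrete, here's a toy sketch of path-style linking; the names (Node, resolve) are illustrative, not from any real system. Each node proxies only the next hop, so revoking any segment breaks the link without touching the (immutable) target itself:

```python
# A "symlink" stores the *path* to its target; each node along the
# way forwards only the next segment, like a layer of onion routing.
class Node:
    def __init__(self, name):
        self.name = name
        self.children = {}

    def lookup(self, segment):
        # A node only knows its own children; it cannot mint a direct
        # reference to anything deeper, only forward the request.
        return self.children[segment]

def resolve(root, path):
    node = root
    for segment in path:
        node = node.lookup(segment)   # each hop can refuse (revocation)
    return node

root = Node('/')
docs = Node('docs'); root.children['docs'] = docs
entry = Node('entry'); docs.children['entry'] = entry

link = ['docs', 'entry']              # the stored path, i.e. the symlink
assert resolve(root, link) is entry

# Revoking access anywhere along the path kills the link without
# mutating the target.
del docs.children['entry']
```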

Now, where does this leave me if I want permissions to be baked into a capability, and generally immutable?

On the one hand, I can now manage mutability in a sane way in the UI layer, because now the size of the local neighborhood is under the control of the node. This is a bit of a return to the heavier-weight approach to linking that I was originally considering, although it still only requires write access to one side of the link, plus mint access (which could be a limited form of write in the mutable case, but it doesn't have to be).

On the other hand, immutability is really really nice. Specifically, being able to decode the permissions and determine if an operation is allowable offline is a big win, as it allows for some fairly aggressive caching even in the worst case, and in the best case may actually reduce the time-complexity through memoization.

Notes on the Implementation of a Blog

Publishing is currently a three-step process consisting of writing an entry perfectly, publishing it to a staging point, and then committing the contents of that staging point to the real blog. This causes me a small amount of grief:

  • Links don't work on the staging site

  • Comments are published to the staging site rather than the front page (this is not quite a feature)

Approaches I've considered to fix this:

  • Just use Blogger as intended

    Not going to happen because of principles I'll explain some other time

  • Rewrite the content as Blogger uploads it to the staging site

    A bit more tempting, although it's brittle. More importantly, it opens me up to all sorts of security vulnerabilities that I'd rather avoid in what should be a rock solid "deploy" script.

  • Replace Blogger with something I write myself

    Now we're getting somewhere!

However, understand that when I say "replace Blogger", I'm not starting from square one. Some time ago, I actually used a framework of my own invention based on a capability security model inspired by Richard Kulisz (don't ever tell me I didn't give credit where credit is due :p). The problem was that I wasn't following my own advice, and didn't have a backup of the material in a usable form when the inevitable happened. Ah well, live and learn, it wasn't that fun to work with anyway.

The idea here is to do it again, but with a focus on creating something on top of that architecture that ends up feeling more like a blog than a wiki.

So, how do you create a blog on top of a CSM architecture? Why, I'm glad you asked...

A question about PyPy's JIT

Although I'm sure this is already obvious to the PyPy people, I'm quite interested to see how close they are to a system that would be capable of efficiently executing interpreters written on top of the existing system.

PyPy is a Python implementation written in Python. The translation and JIT architecture (as I understand it) uses manually inserted hints [pdf] to indicate which variables belong to the interpreter vs the interpreted program, so that the JIT can accurately determine when the interpreted program has looped (as opposed to the interpreter itself). This is important because, in general, optimizing the interpreter's own code has fairly limited gains: you get a faster interpreter, but execution is still interpreted. The hints allow the system to distinguish the accidental work of interpretation from the essence of what is being interpreted.
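To illustrate what those hints look like, here's a toy bytecode interpreter in roughly the shape RPython's JitDriver expects, with the driver stubbed out so it runs in plain Python. The real machinery lives in rpython.rlib.jit; treat the details here as a sketch of the green/red split, not the actual API surface:

```python
# Stub standing in for rpython.rlib.jit.JitDriver, so this runs anywhere.
class JitDriver:
    def __init__(self, greens, reds):
        self.greens, self.reds = greens, reds
    def jit_merge_point(self, **kwargs):
        pass  # the real driver uses this to spot loops in the *program*

driver = JitDriver(greens=['pc', 'program'],  # constant per program position:
                                              # these identify the interpreted loop
                   reds=['acc'])              # run-time state to be optimized

def interpret(program):
    pc, acc = 0, 0
    while pc < len(program):
        # The hint: same (pc, program) seen again means the interpreted
        # program looped, not merely the interpreter's dispatch loop.
        driver.jit_merge_point(pc=pc, program=program, acc=acc)
        op = program[pc]
        if op == 'inc':
            acc += 1
        elif op == 'double':
            acc *= 2
        pc += 1
    return acc

print(interpret(['inc', 'inc', 'double']))  # 4
```

The question above amounts to: could the system discover that split for an interpreter it is itself interpreting, without anyone writing the hints?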

But it's not recursive. Even though the runtime has the required logic, it is missing the hints, and so an interpreter running on top of this stack will run faster, but it won't make the jump to direct execution.

The question that intrigues me is this:

Would it be possible to generate those hints dynamically to make the gains available to higher level interpreters?

Wait, I know what happens next...

Interesting service being launched today by Wolfram Research

Any teenager is capable of making a machine from scratch that can seriously injure a person. The difference between a motor that you can stop with a bare hand, and one that will simply take your hand off, can be remarkably incremental.

I wonder how close we have to get to that threshold in order to have enough experience to see how close we are.

No, I don't think the imminent launch implies an imminent hard takeoff. I just find myself wondering if we'll recognize the potential of the technique, or whether we'll just make two big lumps of fissile uranium by accident ("Oooo! Shiny!") and get on with the business of bashing them together.

Kelly criterion

Aside from the obvious issue of compensating for errors, I don't have a good intuitive understanding of why one wouldn't want to maintain a bet as close to the Kelly bet as possible.

The usual complaint is that you want to minimize your downside risk in the short term, while a Kelly bet is concerned only with maximizing long-term gains.

What doesn't sit well:
  • "Short term" is already well known to be a bad measuring stick. You measure your progress over months and years, not days and weeks. Optimizing for the short term isn't much different from sitting down at a $1-$2 no-limit poker game with your last $50, hoping that you can play so conservatively that you somehow stave off the inevitable ruin.

  • I suspect that the simple explanations are applying a rule of thumb in order to model the needs of a career player: the competing constraints of paying the bills while also growing the bankroll. My intuition is that including this explicitly, or baking the effect into an "effective bankroll" (instead of a fudge factor on the ideal bet), would give a better overall result, or at the very least a more intuitive explanation.
I need to think about this more.
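For reference, the basic Kelly formula itself is short. Here's a sketch, including a crude version of the "effective bankroll" idea above; the effective_bet helper is my own illustration, not an established rule:

```python
def kelly_fraction(p, b):
    """Fraction of bankroll to bet when you win with probability p
    at net odds b-to-1 (and lose your stake otherwise)."""
    return p - (1 - p) / b

# 60% to win an even-money bet: stake 20% of the bankroll.
f = kelly_fraction(0.6, 1.0)

# Crude 'effective bankroll': reserve living expenses first, then
# bet full Kelly on what's left, rather than fudging the fraction.
def effective_bet(bankroll, reserved, p, b):
    return max(0.0, (bankroll - reserved) * kelly_fraction(p, b))
```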

Card-counting and evolutionary algorithms

I'm a bad blackjack player. Bad enough that I refrain from playing anywhere except a friendly home situation with no money at stake, where I definitively demonstrate how bad I actually am.

That said, I find myself intrigued:

Given a population of players based on existing counting techniques, with crossover and mutation, and a fitness function that included optimizing for function size and minimum worst-case runtime memory usage, what sort of counting technique might we turn up?
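I haven't built this, but the skeleton would look something like the following. Everything here is a placeholder: in particular the fitness function stands in for a real blackjack simulator, with a complexity penalty approximating "function size and runtime memory usage":

```python
import random

RANKS = 10  # count weights for ranks 2..9, ten-value cards, ace
HI_LO = [1, 1, 1, 1, 1, 0, 0, 0, -1, -1]  # stand-in target, not a simulator

def fitness(weights):
    # Placeholder: closeness to Hi-Lo, minus a complexity penalty
    # standing in for function size / worst-case memory.
    match = -sum((w - h) ** 2 for w, h in zip(weights, HI_LO))
    complexity = sum(abs(w) for w in weights)
    return match - 0.1 * complexity

def crossover(a, b):
    cut = random.randrange(1, RANKS)
    return a[:cut] + b[cut:]

def mutate(w, rate=0.1):
    return [x + random.choice((-1, 1)) if random.random() < rate else x
            for x in w]

random.seed(0)
pop = [[random.randint(-2, 2) for _ in range(RANKS)] for _ in range(50)]
for _ in range(100):
    pop.sort(key=fitness, reverse=True)
    elite = pop[:10]                      # elitism: keep the best as-is
    pop = elite + [mutate(crossover(random.choice(elite),
                                    random.choice(elite)))
                   for _ in range(40)]
best = max(pop, key=fitness)
```

Swap the placeholder fitness for actual expected value against a simulated shoe and the interesting question becomes what the complexity penalty breeds out.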

HTML is to the web as assembler is to computing

I'm kinda surprised that there haven't been more projects taking a translation/compilation approach to working around IE's rendering deficiencies. We have a good specification of how things are supposed to work, and many years of experience with how they actually work in various browsers.

Is it that the folks interested in compilers and such just aren't interested in web technology? (Well, when I put it that way...)

IE6 CSS Fixer is an example of the sort of approach that I think could yield big benefits, especially with guidance from someone with some compiler-writing experience. Splitting the general problems of graphic design and application development from the tinkering needed for cross-browser issues would mean one less rather annoying pebble in many folks' shoes.
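As a sketch of the flavour of transformation such a pass could make: IE6 famously ignores min-height but treats height as if it were min-height, so a compiler-style tool could rewrite one into the other mechanically. A real tool would use a proper CSS parser, not a regex; this is only illustrative:

```python
import re

# Toy 'compile pass': take standards CSS, emit an IE6-targeted variant.
def compile_for_ie6(css):
    # IE6 applies height like min-height, so this substitution gives
    # the intended rendering there.
    return re.sub(r'min-height\s*:\s*([^;]+);',
                  r'height: \1;',
                  css)

src = ".panel { min-height: 200px; border: 1px solid; }"
print(compile_for_ie6(src))  # .panel { height: 200px; border: 1px solid; }
```

A catalogue of such rules, applied as a build step, is exactly the split between design work and cross-browser tinkering I mean.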

Sustaining my messy approach to browsing

My messy browsing habits may yet become sustainable!

Mozilla wiki on multiple processes

I'm a messy guy.

I know exactly where any one of several hundred index cards, articles and scraps of equipment are at any given point. And like any self respecting geek, my electronic life is in all ways a reflection of my real world self.

Which implies that my browser profile is a mess too.

Now, you're thinking "Ha, he's got a million and one bookmarks just like me". And you're right, but that's not what I'm talking about.

At this moment, I have one hundred and seventy-two tabs open. This places me firmly into the category of people who (A) can tell you that Fasterfox, Swiftfox, and so forth are placebos, (B) know that Firefox's memory handling is considerably better than the lay press makes it out to be (image decoding aside), and (C) know that single-threaded, event-oriented cooperative threading does not scale in the presence of mutually indifferent parties, let alone hostile ones.

Hearing about Google Chrome's approach was a breath of fresh air; it showed that the OS/VM/runtime-like nature of a modern browser was finally being recognized. Yes, I'm an odd use-case, but I can assure you that handling these issues at the architectural level is going to address a multitude of seemingly impossible-to-pin-down behaviors that users have been complaining about for years.

Seeing the Mozilla team taking the approach into consideration is a huge step towards solving the underlying problem, regardless of whether they actually end up using this particular approach. The important thing is that it's being considered as an architectural issue, and not just something to be blamed on the websites themselves.

Minimum Standards

You're a knowledge worker.

A fancy term that just means you use your computer for actual honest real creative work. I'm not talking about time-sheets and a contact list here. That spreadsheet that serves as your company's ERP. The irreplaceable original files making up your portfolio. The curriculum for the class you're teaching. The knowledge that you've gained and reified into something communicable. These things have value. But only as long as they exist.

This feels like a good time to point out that your hard drive is probably going to die this year. "Oh."

Okay, maybe not this year. Really, it's only a 5% chance or so. But a 5% chance of an unrecoverable loss of data is enough to keep me up at night. There are many things that can cause you grief in this department.
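The arithmetic behind the worry, assuming independent 5% annual failure odds:

```python
# Back-of-the-envelope: a small annual failure rate compounds.
def chance_of_failure(annual_rate, years):
    return 1 - (1 - annual_rate) ** years

print(round(chance_of_failure(0.05, 1), 3))   # 0.05
print(round(chance_of_failure(0.05, 5), 3))   # 0.226
```

Over a drive's typical lifetime, "maybe not this year" adds up to better than one chance in five.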

The point is to acknowledge that the universe tends towards maximum irony and to start acting like failures are expected, so that when they happen you're not left with the choice of redoing a month's work, or a $5,000 recovery bill, or simply being forced out of business because that information was both truly irreplaceable and irrecoverable. All you need is a minimum standard of care. You wear your seatbelt when driving. You have working smoke detectors where you sleep. And you have automatic nightly backups of your data.

Well, you will, soon enough. :)

  • Backups

  • The common wisdom: "You need backups! What if the building burns down?"

    This is not why you should have backups. The common wisdom is really just a ready-made statement to show that you think about Big Problems, while giving you an excuse to procrastinate and generally ignore the issue. Yes, it's a problem that should be addressed, but our goal today is the minimum standard of hygiene, and worrying about redundancy and geographic distribution and backup windows is just going to overwhelm you and give you an excuse to give up on the whole thing. And we're not giving up today.

    No, you're going to get your backup situation figured out because your hard drive is going to die this year. Yes, really. Hard drives have lifetimes measured in years, not decades. And your machine isn't exactly new, is it?

    We're going to do a daily backup. Backups are annoying because they take a long time to copy everything, and the machine is slow due to the extra load on the disk while they're running. But if you back up every day, then you only ever have one day's worth of extra data to move.

    We're going to back up everything. Every day. Getting selective and only backing up particular files is a very good way to ensure that you're missing only the most vital files. Trust me, you don't want there to be any question that that work you did last week for the first time in a new program is being included.

    And, we're going to back everything up every day, automatically. It's vitally important that the backup happens whether or not you remember to start it. And running it by hand will tempt you into changing the process by hand, and for this task inconsistency is your absolute enemy.

    We want an automatic process that backs everything up every day.

  • So, what tool do we use for this?

  • The common approach is to use a nice point-and-click tool to run the backups. You should not use one of these tools.

    Transparency is your ally in this task. You need to understand each link in your backup process in as much detail as you can, and this means minimizing the number of links in the chain. Point-and-click tools excel at creating intricate setups that are not the simplest thing that could possibly work.

    You're going to need three things. You already have two of them.

    • External Drive

    • You need an external drive that plugs into your computer using a USB cable or similar. This shouldn't set you back more than $100-$150, but it's not optional.

      Backing up to CDs or DVDs practically guarantees that you won't perform the backups on a regular basis, and makes the whole process far more painful than it needs to be.

    • Scheduler

    • On any modern unix (Linux, BSD, Apple's OS X, and so on), the scheduler will be cron. Commonly, there will be a folder called /etc/cron.daily, and any script placed in that folder will be run once per day at a suitable time. Exactly what we need.

      On windows, there's typically a built-in scheduler service which is adequate for the task.

    • Copier

    • Again, the tool we need is already available on any modern unix. The rsync tool will reliably copy everything, automatically checking that everything was written correctly, and preserving any special metadata that other tools may not include.

      On windows, I'd strongly recommend grabbing a copy of rsync from cygwin or wherever.

    What we want is a very simple script, so simple that you can understand it.


    # -v       Print the names of the files to the screen as we back them up.
    # -a       Do the things necessary to give a nice complete archive of a set of files.
    # -x       Don't go exploring mount points that we run across.
    # --delete Delete files from the backup if they're no longer found.
    # /        The source: copy everything from the root drive.
    # /media/disk/backups/root  The target: copy everything to here.
    rsync -vax --delete / /media/disk/backups/root

    Save this in a file called "backup", and add it to your /etc/cron.daily folder. Tomorrow morning, check that your external drive has a copy of all your data on it, and bask in a warm glow knowing you're doing better than 90% of your peers.

    Much better, right?