IoC, Functional Programming, and DSLs

The Inversion of Control (IoC) design pattern, functional programming, and Domain Specific Languages are all different faces of the same underlying concept and can be used interchangeably.  Recognizing this should make it easier to understand each one and when they should be used within your own code.

IoC simply refers to a library passing control to it’s caller.  A typical object-oriented implementation would be a function which takes an object that implements ISomething where ISomething has a single method that the function then calls.  Those familiar with the visitor pattern should recognize that the visitor pattern is just a specialization of IoC where the callers code is applied to each object in a collection (actually any structure, it doesn’t *have* to be a collection).

I think the tie between IoC and functional programming should be pretty obvious at this point.  If your language supports first class functions and closures, then creating an interface with a single method is just extra work when you can simply pass a function as a parameter.   In fact, since a closure will close over lexical variables to maintain their “state”, both types of IoC implementation (Interfaces and Functions) are exactly equivelant in terms of power.  I believe that using a simple function is more readable though, and it’s certainly more convienent for the caller to not have to create a class that implements ISomething.

The tie between DSLs and IoC is more abstract and requires exploration into the problems being solved.  IoC is a useful design when the problem is not completely specified.  If you are writing a library and the only thing that will vary in how it’s used is the inputs, IoC is just extra typing with no real benefit.  If your problem is only partially specified on the other hand, then IoC allows you to implement the bits of the solution that you can infer from the specs, and leave the rest up to the caller.  This is why very generic libraries tend to use IoC a lot.  A DSL typically provides the users with a base set of combinators (functions that create functions out of other functions) to build up functions that perform a specific type of task in a very high level manner.  A DSL for writing parsers should look very close to BNF so that a grammar can be encoded in the host language with as few implementation specific details as possible.  How to do this with combinators is an interesting topic itself that deserves it’s own post and maybe I will get motivated and write one some day.  A more intuitive example is configuration files, any time you have a configuration file that can change the flow of a program you are creating a mini-language and you are passing control to the end-user (ie.. IoC).

So great, they are the same thing conceptually, what do we gain from this?  Well we can transfer the knowledge we have from one paradigm (like OO) to another (like functional) and vice versa.  More concretely, by recognizing that they are all ways to partially specify a solution, we now have a good solid concrete criteria for deciding when one of these types of implementation are necessary.

On a side note, this should help highlight the difference between a pattern and an implementation.  Interfaces/Functions/Monads are ways to implement the IoC pattern.  Creating a library to “handle IoC” is confusing the two, and if you find yourself writing or consuming one of these libraries, then some reflection on how your handling the task at hand would probably do some good.

5 reasons to use Git instead of Subversion

Git is a new version control system originally written and designed by Linus Torvalds for managing the development of the Linux kernel.  Due to the chaotic nature and massive amout of contributers to Linux kernel development, Linus is quite arguably the most qualified person in the world to design a good VCS.  Git is quite a bit different than a traditional VCS such as Subversion and it doesn’t have the mathematical theory that darcs has behind it, but it is by far the most practical VCS I have used to date.  So anyways, here are my 5 favorite reasons to git.

  1. You always have a full copy of the repository with all the meta-data, online or off
  2. You can merge with anybody’s repository, not only a single central server
  3. Even with the full history, git repositories take less space than a SVN checkout
    brandon@Brandon:~/localtmp/clone$ du -sh *
    1.6G    cohiba.git
    2.6G    cohiba.svn
  4. Modifying the history is easy, using git rebase -i <some_revision> will bring up a list of patches in your favorite editor and allow you to merge/reorder/delete them, then it will apply them as you’ve specified, re-committing with each one
    pick e4eb2ca Added -d option to only look at 1 date
    pick 881ef7f Fixed bug that caused an Prelude.head: empty list exception if no work orders were pushed
    pick 3595183 Added line itemized reports on pushed work orders
    pick 1b6ff52 Fixed the parser so that it doesn't break if there is a [] in the stack trace It was using a bit of a hack, but i've cleaned up the parser so there should be no more suprises, no matter what we throw at it
    pick 12b7023 Changed slurp_until to fail on eof
    pick e511b15 New log file for tests
    
    # Rebase 378dd8b..e511b15 onto 378dd8b
    #
    # Commands:
    #  p, pick = use commit
    #  e, edit = use commit, but stop for amending
    #  s, squash = use commit, but meld into previous commit
    #
    # If you remove a line here THAT COMMIT WILL BE LOST.
    # However, if you remove everything, the rebase will be aborted.
  5. You don’t need a centralized server with a single ‘master’ copy of the codebase with git, but that is  still how most software shops will operate.  Any mucking with this master copy would be noticed by all because git revision id’s are hash sums of the entire history up to that point.

bnIRC 1.0 released

I’ve finally released a 1.0 version of bnIRC.  It’s available on sourceforge as a .deb or a source package.  Windows builds are planned but will have to wait until a future release, hopefully not too far out.

New life in bnIRC

I’ve decided to breath some new life into my IRC client. I’m using a few people on freenode to test test test so that I can make it stable. The goal is to have a rock-solid (no new features) 1.0 release within the next few weeks and afterwards I’ll be adding some features that have been requested.

I created a git repot on github.com. Tarballs/zip files can be downloaded here. Or if you have git installed you can run:

git clone git://github.com/bniemczyk/bnirc.git

If your interested in helping test please do! Make sure that the python and ncurses header files (python-dev and ncurses-dev on debian) are installed before you run ./configure, or you’ll get a very bare-bones interface that is really only meant for debugging.

Quick and dirty intro to Lambda Calculus – Post #3

This is a continuation of my previous post.

In order for \lambda-Calculus to be useful, we have to be able to encode data. In our previous post we used integers and booleans without any explanation as to how they may be encoded. This picks up and fills out those details.

Booleans

It is useful to encode boolean values as functions that take 2 parameters and then evalute the correct argument.

T \equiv \text{True } \equiv \lambda xy . x

F \equiv \text{False } \equiv \lambda xy . y

Assuming we have a boolean P we can encode an if-then-else function as PAB where A is the result if P \equiv T and B is the result if P \equiv F ie..

TAB \equiv A

FAB \equiv B

Church Numerals

\text{Zero } \equiv \lambda fx . x

1 \equiv \lambda fx . fx

2 \equiv \lambda fx . ffx

3 \equiv \lambda fx . fffx

n \equiv \lambda fx . f^nx

And now to show the basic operations on numerals

Increment

\text{Inc } \equiv \lambda nfx . f (nfx)

Example :

\text{Inc Zero } \equiv \lambda nfx . f (nfx) [n := \text{Zero}]

\equiv \lambda fx . f (\text{Zero } f x)

\equiv \lambda fx . f ((\lambda fx . x) f x)

\equiv \lambda fx . f x

\equiv 1

Decrement

\text{Dec } \equiv \lambda nfx . n (\lambda gh . h (g f)) (\lambda u . x) (\lambda u . u)

Example:

\text{Dec } 1 \equiv \lambda nfx . n (\lambda gh . h (gf)) (\lambda  u . x) (\lambda u . u) [n := \lambda fx . fx]

\equiv \lambda fx . (\lambda fx . fx) (\lambda gh . h (gf)) (\lambda u . x) (\lambda u . u)

\equiv \lambda fx . (\lambda x . (\lambda gh . h (gf)) x) (\lambda u . x) (\lambda u . u)

\equiv \lambda fx . (\lambda x . (\lambda gh . h (gf)) (\lambda u . x)) (\lambda u . u)

Addition

\text{Add } \equiv n + m \equiv \lambda mnfx . m f (n f x)

Example:

1 + 1 \equiv \lambda mnfx . m f (n f x)[m := 1][n := 1]

\equiv \lambda fx . (\lambda fx.fx) f ((\lambda fx.fx) f x)

\equiv \lambda fx . (\lambda fx.fx) f (fx)

\equiv \lambda fx . ffx

We can show that \text{Inc } \equiv \text{Add } 1

\text{Add } 1 \equiv \lambda mnfx . m f (n f x) [m := \lambda f . fx]

\equiv \lambda nfx . (\lambda f . fx) f (n f x)

\equiv \lambda nfx . f (n f x)

\equiv \text{Inc }

Multiplication

m * n \equiv \lambda mnf . n (m f)

Example:

2 * 3 \equiv \lambda mnf . n (m f)[m := 2][n := 3]

\equiv \lambda f . (\lambda fx. f^3x) ((\lambda fx . f^2x) f)

\equiv \lambda f . (\lambda fx . f^3x) (\lambda x. f^2x)

\equiv \lambda f . (\lambda x . f^6 x)

\equiv \lambda fx . f^6 x

\equiv 6

Quick and dirty intro to Lambda Calculus – Post #2

This is a continuatin of my previous post.

In this post and all the following posts I write, all \lambda-calculus expressions are left-associative, ie..

ABCD = ((AB)C)D

In this post I will expound upon some of the things I glossed over in the last post.

Variable Bindings

The \lambda operator binds a variable. If a variable occurs in the body of an abstraction but is not bound, then it is called free. In the case of name clashes, variable x is bound to the inner most \lambda x.

\alpha-conversion

The easiest way to ensure that variable names don’t have scope issues is to simply rename all bound variables to a unique name:

\lambda x . \lambda x . (\lambda x . x) x \equiv \lambda x_1 . \lambda x_2 . (\lambda x_3 . x_3) x_2

\beta-reduction

What I referred to as simplifying in the last post was actually a \beta-reduction. \beta-reduction is the act of making all the applications in an expression that do not use any bound variables. If all bound variables in expression have unique names, then a beta-reduction is very simple:

(\lambda A . B) B reduces to B[A := B]

The \alpha-conversion described above is very important because if variable names clash, then a \beta-reduction is much more complicated. Take the following term:

(\lambda ab . ab) (cb)

Without first applying an \alpha-conversion we might end up with something like this:

wrong:

\lambda ab . ab [a := cb]

\equiv \lambda b . (cb) b

Now our free variable b in (cb) is incorrectly bound. Instead we should get this:

correct:

(\lambda ab . ab) (cb) \equiv (\lambda a_1 b_1 . a_1 b_1) (c_1 b_2)

\equiv \lambda a_1 b_1 . a_1 b_1 [ a_1 := c_1 b_2 ]

\equiv \lambda b_1 . (c_1 b_2) b_1

Now our newly renamed b_2 is still a free variable.

That’s it for this post, in an upcoming post I will explain how to encode some basic datatypes using pure \lambda-calculus.

Quick and dirty intro to Lambda Calculus – Post #1

This is the first of a multi-part post about \lambda-Calculus.  I’m going to start with a description of the Untyped \lambda-Calculus.  Future articles will include Church Numerals and Typed \lambda-Calculus.

Why?

\lambda-Calculus is important because it has a close tie with functional programming.  In fact most functional languages compile to some form of \lambda-Calculus.  You could say that \lambda-Calculus is to functional programming what the Turing machine is to imperative programming.

The very basics

Everything is a function in \lambda-Calculus but for the purposes of this post we will extend the core calculus with integers and basic integer operations.  In a future post we will go into how integers can be represented as functions (for those unwilling to wait, check out Church Numerals).

Lets start with a sample expression:

\lambda x . x+y

This is called an abstraction.  In this expression we call the x variable bound and the y variable free.  There is one other operation called application.  Application takes the form of:

\lambda x . x+y [x := 3]

\lambda-Calculus evaluates expressions by simple substitution so the above expression simplifies to 3+y.  One thing to note about this is that in the term \lambda x . M where M is another term, only free occurences of x get replaced.

\lambda x . \lambda x . x+z [x := 3] therefore simplifies to \lambda x . x+z.

All functions are functions of one variable

\lambda-Calculus does not allow abstractions that take multiple arguments.  Instead you write functions that return functions which take the next variable.

\lambda x . \lambda y . x + y

simplifying this with x = 3, y = 4 we get:

\lambda x . \lambda y . x + y [x := 3][y := 4]

\lambda y . 3 + y [y := 4]

3 + 4

7

For conciseness, \lambda x . \lambda y . M can be written as \lambda xy . M but it is important to remember that this is a function that takes the value for x and returns a new function.

The Y combinator

It may have occured to you that recursion cannot be implemented with a simple substitution model and there are no looping constructs.  How can this be Turing Complete?  Recursion can be simulated by using a function Y such that Yf \equiv f (Yf) (ie. a fixed-point function), assume for the moment that such a function Y exists.

If you wanted to represent the factorial function with recursion you could do it as:

\text{Factorial }x \equiv \text{ if } x == 0 \text{ then } 1 \text{ else } x * \text{ Factorial }(x-1)

Since terms are not named, and recursion is impossible, we have to write a function that takes (Yf) as a parameter.

\lambda f . \lambda x . \text{ if } x == 0 \text{ then } 1 \text{ else } x * f (x - 1)

we’ll call this expression \text{ Fact } for brevity.

Since: Yf \equiv f (Yf)

Y \text{Fact} \equiv \lambda x . \text{ if } x == 0 \text{ then } 1 \text{ else } x * f (x - 1) and f is bound to Y \text{Fact} giving us recursion!

Now to show the (not-so-magical) fixed point function Y:

Y \equiv \lambda f . (\lambda x . f (xx))(\lambda x . f(xx))

Proof

Y g = (\lambda f . (\lambda x . f (xx)) (\lambda x . f(xx))) g

Y g = (\lambda x . g (xx)) (\lambda x . g(xx))

Y g = g [ ( \lambda x . g(xx)) ( \lambda x . g(xx))]

Y g = g (Y g)

Taint mode doesn’t catch SQL injection bugs by default

After a friend pointed out an SQL injection bug in some of my Perl code, I was confused as to why taint mode had not caught it (I use -T for everything and recommend you do too). After checking the docs again, DBI only checks if statements are tainted if you set the TaintIn property of the handle. I’m not sure how I missed this before. Personally, I think this is a brain-dead behaviour. If a programmer is security conscious enough to enable taint mode, he probably wants to know about SQL injection bugs.

Update:
After enabling TaintIn for some of my code, DBI became even more retarded. With TaintIn DBI also checks if your sql parameters are tainted. So some code like $sth = $dbh->prepare(”insert into mytable values (?)”); $sth->execute($user_supplied_data); fails. In my opinion this defeats the entire purpose of DBI parameters. I may patch DBI locally and see if I can get it applied upstream.

Perl Grep Replacement

The script is a perl replacement for grep. It has a couple extra features that I like, and of course the best
benefit is it uses perl regular expressions instead of posix.