Monday, February 9, 2009

Language-oriented programming: too much, too fast

There was some recent discussion on Hacker News about the 2004 article on "Language Oriented Programming" by Sergey Dmitriev. With my growing interest in programming languages, this article, and the pending release of JetBrain's Meta Programming System, I have been thinking a lot about the future of programming.

As Sergey points out, MPS is not the first entry into this paradigm of software development. Intentional Programming was being demonstrated by Microsoft Research as early as 2000. Wikipedia lists several implementations of the language-oriented programming concept. (OK, that is enough links for now!) Sadly, none of these systems have been met with wide spread success. Despite my unlimited respect for the JetBrains team and love of their products -- especially Resharper -- I expect MPS to fail to achieve critical mass. I don't think many programmers would disagree. It's just too much, too fast.

Software development needs to evolve, not start anew. If you change too much at once, the common developer simply won't accept it. Language designers are still re-inventing Lisp piece by piece in an effort to make sense of all the different concepts for Mort. In order to improve adoption of language-oriented programming concepts, the critical path must be identified and the programming community must be lead down that path one step at a time.

I believe that the first stepping stone is the box editor.

Both MPS and Intentional Programming provide a visual tree of boxes editor and store the abstract syntax tree in a sort of source code database. Both of these concepts are critical to power of language-oriented programming, but changing the storage of source code is simply too radical of a change. Well, actually, source code is routinely stored in databases for IDE features, but the authoritative storage always remains text files. All of a programmers most trusted tools operate at the string level and it is simply impractical to throw everything out at once.

This is why I propose the creation of a new editor which uses an abstract syntax tree box editing model, but preserves the source as text. This editor would have to play nice with other developers who are using traditional text editors. It should be possible for a single developer to try the editor for a day or two without another collaborator even noticing. There are clear benefits to a box editor even in the absence of language-oriented features. For example, navigation of source code by parent, child, or sibling relationships instead of by word or by line. Advanced renders can be used to layout math expressions or to show a referenced image file in a comment.

Editors are a religion, start a new one. If some box editor becomes popular enough, other tools would spring up around them. To help the process, the editor should have well exposed extension and composition mechanisms. For example, it should be trivially easy to add new renderers for various nodes. It should be equally trivial to utilize the abstract syntax tree system to create stand alone code analysis tools.

If box editors garner enough of a following, then the language-oriented programming advocates can seek the next stepping stone.

Tuesday, February 3, 2009

Break the cycle of broken builds

Broken builds suck. Breaking the build sucks. No one wants to have their work stopped by others' build breaks and no one wants to cause work stoppage on account of their own build breaks.

So then why do builds break so often? You'd think that after so many years of software development, someone would have solved this problem. Microsoft employees tens of thousands of engineers across hundreds, if not thousands, of code projects. Many of those code projects are development tools for the engineering community at large as well as engineers within Microsoft. Yet, still, all of the tools at the disposal of those engineers are highly prone to build breaks.

Below, are three ideas for reducing build breaks. Together, they could all but eliminate the problem.

Use a better version control system

Everyone keeps raving about distributed version control, and for good reason. I won't rehash the argument here, but I will say: branch early and merge often. Perforce (better known as Source Depot within Microsoft) and it's spiritual successor Team Foundation Server are completely unbearable. A great majority of build breaks can be contained if people are only pulling known good changes from dedicated integrators. Everyone checking in all at once to a main branch simply doesn't work with anything but the most tightly knit of teams. Even if the does build break, it should only block the person waiting on that particular change.

Isolate build breaks

`sd submit -c 12345` translates roughly to "I'm 100% confident that my changes are solid and am equally confident that I am submitting the changes that I think I am." You never can be 100% confident, don't take any chances: submit to a private branch which is being monitored by a build service. The server should detect your submission, run a full buddy build (with unit tests!), and then submit it to the shared branch only after it has passed.

If your builds take too long for this, people should be able to grab changes down from your unverified branch on an as-needed basis. If this happens too frequently: your builds take too long. Refactor your system into smaller components and formalize the contractual interface between those components. Then buddy build just the components which have changed. Verify that the interface hasn't changed with a unit test. This will stop compile errors at the boundaries.

Go lazy, be late

This one is a big of a pipe dream... software systems developed with dynamic languages do not experience build breaks nearly as often as those developed with static languages. This is because dynamic languages are typically lazy-loaded and utilize late bounded method invocations. If someone adds a new button to the UI which calls a method in a file they forgot to add to source control, that should only block people who want to click the button!

There is no firm technical reason why statically or JIT compiled systems can't simulate this. Most compile errors could theoretically be treated as a runtime error. Clearly, compile errors are valuable, but should be ignorable whenever possible. Don't ignore them in your known good builds, but ignore them when they stand between you and forward progress.

Since most of us aren't writing our own compilers are full tool chains, here is some more practical advice for projects using statically compiled languages: write code which fails gracefully. Runtime errors can be build breaks just the same as compile errors, but try to defer the negative consequences as much as possible. One broken feature shouldn't break the whole project; unless, of course, you are about to ship. In production builds, perform verification code at startup or in unit tests, but try to be lazy and late in development builds.