Jacob and The Man in Black

By Drew McCormack

Like most scientific developers, I've used statically-typed programming languages like Fortran and C++ for most of my professional life. But over the last few years I have moved more and more of my coding to dynamic languages like Python and Objective-C, and I have come to prefer dynamic typing to static typing. I am by no means alone in this: there is a trend toward languages like Ruby and Python in fields as varied as web application development and scientific data mining.

The reason for this post is that I have been reading the new Pragmatic Programmers book Code in the Cloud by Mark C. Chu-Carroll, which provides an introduction to cloud computing with Google App Engine. On the whole, it's not a bad introductory text, but one aspect of it riles me: it denigrates Python, and dynamically-typed languages in general. In 2010, I find this position quaint, to say the least. In 1998, it would have been the status quo, but to find an author who still thinks like this about web development is surprising, especially given that Google App Engine is a poster child for dynamic scripting languages in the cloud.

I know it can be hard when you first break away from static typing. It feels a bit like your safety net has been taken away. In reality, you are just being exposed to risks that were there all along. Static typing gives the impression of safety, because it seems like the compiler is looking out for you, but for the most part it is just catching a few innocuous syntax problems. And most of those problems actually arise due to extra code that you have had to write simply to give the compiler the static information it needs to pass judgment.

Statically-typed languages like C++ and Java are large and complex, because they need to provide the means to instruct the compiler what constraints it should check. You typically end up writing much more code than is actually needed to describe the function of your program, and most of the errors that arise during compilation are actually from this 'excess' code, which would not even be there in a dynamically-typed language. It feels like your code is solid, because the compiler found a bunch of mistakes you made in your variable declarations, when in fact this information was extraneous to begin with.

Of course, a static language does let you catch some errors at compile time that would not be found until run time in a dynamic language, but these are fewer than you would expect. More importantly, a compiler can only catch static programmer errors, and you still have to deal with the more dangerous runtime errors, which can arise in any language. The risk with a static language is that it gives the perception of safety, and may distract you from runtime bugs, which are more serious and harder to track down. Runtime errors are the domain of testing, formal or otherwise, and if you are too busy tracking trivial, self-induced bugs via the compiler, you may not have the time or inclination to do serious testing, which is much more important.
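To make the distinction concrete, here is a minimal sketch in Python. The function names (mean_buggy, mean, check_mean) are hypothetical and chosen only for illustration. Both versions carry the same type annotations, so a type checker has nothing to object to in either; only a runtime test tells them apart.

```python
from typing import Callable, Sequence


def mean_buggy(values: Sequence[float]) -> float:
    # The types are all consistent, but the logic is wrong:
    # the sum is divided by the wrong count.
    return sum(values) / (len(values) + 1)


def mean(values: Sequence[float]) -> float:
    # Same signature, correct logic.
    return sum(values) / len(values)


def check_mean(func: Callable[[Sequence[float]], float]) -> bool:
    # A tiny runtime test: the mean of 1, 2, 3 should be 2.
    return abs(func([1.0, 2.0, 3.0]) - 2.0) < 1e-12


if __name__ == "__main__":
    print("buggy version passes:", check_mean(mean_buggy))
    print("correct version passes:", check_mean(mean))
```

The bug here is a runtime error in the sense used above: no amount of type information exposes it, in a static or a dynamic language.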

Of course, I am not advocating that you never use statically-typed languages at all. They still have their place; it's just a smaller place than you might think. For highly performance-critical code, C++ or Fortran can be justified for the compile-time optimizations they bring about. But even then, the best solution would probably be a small core of C++/Fortran in an app predominantly written in a dynamic language.
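As a rough sketch of that split, the following Python calls a compiled C routine through the standard ctypes module. It assumes a POSIX system where the C math library (with its cos function) can be loaded; the wrapper name c_cos is hypothetical, and the C library's cos merely stands in for a real performance-critical core.

```python
import ctypes
import ctypes.util
import math

# Locate the C math library. If find_library returns None, CDLL(None)
# falls back to the symbols already loaded into the process (POSIX only).
libm = ctypes.CDLL(ctypes.util.find_library("m"))

# Declare the C signature: double cos(double)
libm.cos.argtypes = [ctypes.c_double]
libm.cos.restype = ctypes.c_double


def c_cos(x: float) -> float:
    # Thin Python wrapper around the compiled routine.
    return libm.cos(x)


if __name__ == "__main__":
    # Compare the compiled routine with Python's own math.cos.
    print(c_cos(1.0), math.cos(1.0))
```

The declarations of argtypes and restype are the only place where static type information is needed: at the boundary with the compiled code.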

PS The 'Lost' reference in the title should be self-explanatory for those who follow the series, but I'll leave it to you to decide which is which. For those who don't follow the series: You're late!

Comments

I just have one question:

I just have one question: the man in black is dynamically-typed, right?

well, it depends what one

well, it depends what one understands as 'scientific development'. in what i do (computational physics), something like 80% of the code is performance critical, and basically all one could do in e.g. python is a kind of front-end which reads input parameters, calls a computational (C/C++) routine and writes the results.
but if all you need is to make a few calls to the lapack library, of course it'd be better to just use python or similar...

I don't mind the advent of

I don't mind the advent of dynamic typing as much now that I'm starting to get used to it. However, I do wish there were a way to indicate in dynamically-typed languages what variables I'm actually using. It's a pain in the neck to have to debug "working" code that returns garbage results, only to find out that the error was caused by accidentally typing "countt += 1" instead of "count += 1" halfway through.
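A minimal sketch of this kind of typo, assuming Python (the comment does not name a language) and using hypothetical function names. Note that in Python the augmented form "countt += 1" on an undefined name fails loudly with an exception; it is the plain assignment form that silently binds a new name and produces wrong results.

```python
def count_positive_buggy(values):
    count = 0
    for v in values:
        if v > 0:
            # Typo: this binds a brand-new name instead of updating count.
            countt = count + 1
    return count


def count_positive(values):
    count = 0
    for v in values:
        if v > 0:
            count += 1
    return count


if __name__ == "__main__":
    data = [1, -2, 3]
    print(count_positive_buggy(data), count_positive(data))
```

The buggy version runs without complaint and always returns zero, which is exactly the "working code with garbage results" situation described above.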

In Python you can use the

In Python you can use the 'who' command to see which variables are currently used in the namespace. I also use Spyder, a free, open source, interactive MATLAB-like GUI for Python, which shows the currently used variables in a workspace similar to MATLAB's.

Re which is which: The

Re which is which: The answer is as obvious as it is clear which is good and which is evil in the series. I'll say no more (...because I can't).

Drew

---------------------------
Drew McCormack
http://www.mentalfaculty.com
http://www.macanics.net
http://www.macresearch.org

I think "who" is just a

I think "who" is just a command within the numpy module, and applies only to numpy array data.

On the other hand, dir() will give you a list of all variables in the current namespace, and dir(name) will give you a list of variables in the module "name"... but I don't think it will help at deeper levels (e.g., trying to figure out what are the current variables inside a function).

Peter Erwin
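For the case of variables inside a function, one option is to call the built-in locals() (or dir() with no arguments) from within the function itself. A minimal sketch, with a hypothetical function name:

```python
def running_total(steps):
    total = 0
    for i in range(steps):
        total += i
    # locals() returns a dict of the names bound in this function's
    # scope at this point; sorting it yields just the names, in order.
    return sorted(locals())


if __name__ == "__main__":
    print(running_total(3))
```

This only reports names that have been bound by the time of the call, so it has to be placed at the point of interest inside the function.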

Of course, statically typed

Of course, statically typed languages have their place. Even for a complete program.

For instance in scientific computing, Fortran remains a good choice for its simplicity (think of it, the "sin" or the "atan" functions do not need external dependencies) and I would say it is even more important to have a core language such as Fortran available on "the cloud" (I used the word for the hype only, I meant it is important to me that my code compiles right away on a university central cluster).

But again, as a glue language, for GUI applications, for non-performance-critical code and for prototyping, a language like Python excels.

The flexibility given by the existence of both static and dynamic languages is a real bonus.

Drew said, "On the whole,

Drew said, "On the whole, it's not a bad introductory text, but one aspect of it riles me: it denigrates Python..."

While this is tangential to the main point of this article, I have to say that I'm not a fan of Python. I heard all of these wonderful things about it, so I decided to pick up the O'Reilly book and learn it. Five minutes later, after learning that blocks were controlled by indentation, I very quickly ended any attempt to learn Python. I'll stick with Perl...

Kevin

It is very difficult to

It is very difficult to write robust, maintainable code in an untyped/dynamically typed language.

Contrary to popular opinion, untyped, interpreted script languages do not speed up development. They slow it down. Compilation-time errors will just become runtime errors which take longer to catch if the error manifests itself towards the end of a potentially long calculation.

Script languages are appropriate for writing disposable code. Thankfully, not all code is meant to be disposable.

I have actively been writing

I have actively been writing code for well over 25 years and I have never understood the need for people to denigrate or defend a programming language or other technology. No one is forced to use any technology or platform; it is the ultimate freedom of choice in my jaded little eyes. The reality, I think, is that as long as what you choose gets the job done, go use what you want. Given that we live in a capitalistic society, if you choose something that doesn't work then economics will play out and you will either prosper or starve. The point is that debating a programming language really doesn't matter or mean anything. Just go do what you want in whatever manner you see fit, because as software developers and technologists there is a certain right to anarchy that comes with the role.

So go do Python, Java, C++, Ruby, Objective-C, HTML, C, Fortran, BASIC, bloom or whatever makes you happy. In the end, if programming does not feel a little rogue, wild, fun, liberating and even a little nasty, maybe you should find something else to do. The point is that you can achieve all that without putting down what others choose. Rock on my fellow code warriors and play nice or, as Andy once said, I think, "...make code, not war!"

Of course, it is good to

Of course, it is good to have a few tools in your toolbox. I use a dynamically typed language, these days probably Octave or Mathematica (the Home version is only $350 and the license suits my work style), when and only when the problem at hand is a quick one-off that I _know_ will not grow. For major projects, dynamic typing is a time-waster for reasons already stated in this discussion (e.g. typos) and many others.

I feel so strongly about using a language that is efficient in runtime costs as well as programmer costs that for several years I have used Ada for all of my main research work. There are so many compile-time errors that it catches that it is silly crazy how much time it saves me. I have written operator overloads to handle mixed-mode arithmetic and doing vector-style calculations is trivial and easier to read than Matlab code (no / versus ./). Ada is easy to learn (you won't need the whole thing to do scientific work unless you want it) and the compiler, called GNAT, is free and an official part of gcc. There is an active Mac Ada group (www.macada.org) and the compiler has a for-profit sponsor (www.ada-core.com). There are Ada plug-ins for Xcode and TextMate and very high-end language-smart IDEs--GPS, free from ada-core.com, and an Eclipse plug-in--and heavy support in emacs (Aquamacs and Carbon emacs and regular emacs).

If you want a quick (and I mean, _quick_) Aqua GUI, look at Pashua and CocoaDialog. I have written an Ada driver for Pashua which I should elaborate upon at some point on this site.

And I have written complete thick Ada bindings to PLplot (http://plplot.sourceforge.net/) so that Ada is a first-class citizen for this full-featured and popular technical plotting package.

Jerry