Friday, August 28, 2009

syntax and style of programming languages

When picking a programming language, there are several factors to consider:
  • syntax/style
  • supported platforms
  • previous experience with the language
  • size of and activity of the community
  • language features, libraries and frameworks
  • how well it fits the problem domain
  • dynamic/static typing
  • and much more...
The style and syntax of a language have a big impact on the decision when picking a new language, often more so than the language's underlying constructs. A developer is much more likely to accept a language with a familiar syntax -- if you want to test that theory, ask a Ruby developer to pick between Smalltalk and Python.

Style and syntax are important for a few reasons:
  1. efficiency of writing code
  2. how easily the code can be read, understood, and maintained
  3. effect on the structure and design of the program
Some of these choices have been made by the language designers through the syntax, and others have been made by the community that uses the language. If you want code that is maintainable, it has to be easily read and understood, and style/syntax has a big impact on the readability of your code.

examples

Let's take a look at some JavaScript (v1.5):
function allGreetings(people) {
  var name, person;
  var greetings = [];
  for(var i = 0; i < people.length; i++){
    person = people[i];
    name = person.firstName() + ' ' + person.lastName();
    if (person.isBirthday()) {
      greetings.push("Happy Birthday, " + name);
    } else {
      greetings.push("Have a great day, " + name);
    }
  }
  return greetings;
}
There are certain elements of this code that are required by the language: the "function" and "return" keywords, the curly braces, and the parentheses on function calls. However, some things -- like the camelCase names and whitespace -- are style elements that have been generally adopted by the JavaScript community.

Let's take a look at the same code written in Ruby:
def all_greetings(people)
  people.map do |person|
    name = "#{person.first_name} #{person.last_name}"
    if person.birthday?
      "Happy Birthday, #{name}"
    else
      "Have a great day, #{name}"
    end
  end
end
This comparison is not intended to pick on JavaScript, but the Ruby version has some clear wins, such as the "map" method and implicit return. Most people would agree that the Ruby syntax is "cleaner" or "easier to read" -- and a lot of that comes from the decrease in the amount of punctuation (semicolons, curly braces, and parentheses).
Note: there are a lot of different ways that we could write each of the above examples to help improve the style, but I think that each of these generally reflects the style of code typically written in each language.
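For a third data point, here is a rough sketch of the same function in Python. The Person class is a hypothetical stand-in (neither example above defines one), and is_birthday is simplified to a plain boolean field so that the snippet runs on its own:

```python
from dataclasses import dataclass


@dataclass
class Person:
    # Hypothetical stand-in for the person objects used in the examples above.
    first_name: str
    last_name: str
    is_birthday: bool = False


def all_greetings(people):
    """Return one greeting per person, with a special one on birthdays."""
    greetings = []
    for person in people:
        name = f"{person.first_name} {person.last_name}"
        if person.is_birthday:
            greetings.append(f"Happy Birthday, {name}")
        else:
            greetings.append(f"Have a great day, {name}")
    return greetings
```

Punctuation-wise this lands somewhere between the two: no curly braces or semicolons, but the parentheses, colons, and the explicit return are still there.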
There is plenty of debate about syntax and style, so let me fan the flames with my wild claims about what makes for better style...

camelCase sucks

someLanguagesLikeJavaHaveAdoptedTheCamelCaseStyle
doYouFindThisStyleOfNamingEasyToRead
The only advantage that I can see for using camelCase naming is fewer characters. Otherwise, it's generally harder to read, and there is confusion about what to do with acronyms in the name -- should they be TitleCase or ALLUpperCase? I'm looking at you, XMLHttpRequest!!!
languages_such_as_ruby_use_underscores_instead
but_fall_into_the_camel_case_trap_with RubyClassAndModuleNames
Need I say more?
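The acronym problem can be made concrete with a small Python sketch. The to_snake_case helper and its regular expression are my own illustration, not part of any library, and they only handle plain ASCII identifiers. The point is that both camelCase spellings of an acronym collapse to the same underscore name, so the question never comes up in that style:

```python
import re

# One regex alternative per kind of word:
#   [A-Z]+(?![a-z])  an acronym run such as "XML" (it stops before the
#                    capital letter that starts the next word)
#   [A-Z]?[a-z]+     an ordinary word such as "Http" or "all"
#   \d+              a run of digits
WORD_PATTERN = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")


def to_snake_case(name):
    """Convert a camelCase or TitleCase identifier to snake_case."""
    words = WORD_PATTERN.findall(name)
    return "_".join(word.lower() for word in words)


print(to_snake_case("XMLHttpRequest"))  # xml_http_request
print(to_snake_case("XmlHttpRequest"))  # xml_http_request
print(to_snake_case("allGreetings"))    # all_greetings
```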

use punctuation sparingly

Punctuation is a powerful and terse tool. Imagine writing a book with only a single non-alphanumeric character...
Wouldn#apostrophe#t it be tough#end of sentence# You could use the special character to start and end #quote# blocks #quote# with special meaning #ellipsis#
###or you would have to assume that the punctuation had different meanings# based on its context # which would be ambiguous# right#
If you rely too heavily on a small set of punctuation, you get Lisp:
(defmethod problem-successors ((prob binary-tree-problem) state)
  (let ((n (* 2 state)))
    (list n (+ n 1))))
So, we clearly need a little variety in the punctuation, but if we overuse it (cough...Perl...cough) we will write code that we have almost no hope of understanding a few weeks later. How long does it take to figure out what the following code does?
#!/usr/local/bin/perl

$count = 0;
while (<stdin>) {
    @w = split;
    $count++;
    for ($i=0; $i<=$#w; $i++) {
        $s[$i] += $w[$i];
    }
}

for ($i=0; $i<=$#w; $i++) {
    print $s[$i]/$count, "\t";
}

print "\n";
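For comparison, here is roughly the same logic sketched in Python: the script sums each whitespace-separated column of its input and divides by the number of lines. The column_averages name is my own, and the sketch assumes every line has the same number of numeric columns (the Perl version quietly uses the width of the last line it read):

```python
def column_averages(lines):
    """Average each whitespace-separated column over all of the lines."""
    sums = []
    count = 0
    for line in lines:
        values = [float(word) for word in line.split()]
        count += 1
        for i, value in enumerate(values):
            if i == len(sums):
                # First time we see this column: start its running total.
                sums.append(0.0)
            sums[i] += value
    return [total / count for total in sums]


print(column_averages(["1 2 3", "3 4 5"]))  # [2.0, 3.0, 4.0]
```

To mirror the script exactly, pass sys.stdin as the lines argument and join the results with tabs.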

language keywords

Most keywords in a language will be used a lot, and should be kept short because they will be typed so often. For example: def is better than function and var is better than local.

And if you are using a dynamic language (which you should be, by the way), it's also better if your keywords can be runnable commands that obey the same rules as the rest of the code. For example, many languages have a way to import an external library. However, that is usually a compile-time feature of the language.

Let's look at importing in Python:
# here we can use "import"
import SomeLibrary

# but here we have to call "__import__"
module_name = "SomeLibrary"
module = __import__(module_name)
And in Ruby:
# because require takes a string
require "foo"

# it works when passed a variable containing a string
mod = "foo"
require mod
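Back on the Python side, the standard library also offers importlib.import_module, which is the usual way to import a module whose name is only known at run time. A minimal sketch follows; the load_module wrapper is just an illustrative name, and "json" stands in for whatever library you actually want:

```python
import importlib


def load_module(module_name):
    """Import a module whose name is only known at run time."""
    return importlib.import_module(module_name)


json_module = load_module("json")
print(json_module.dumps({"language": "Python"}))  # {"language": "Python"}
```

Either way, the statement form and the string form remain two separate mechanisms in Python, whereas Ruby's require is a single method call in both cases.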

conclusion

I've only scratched the surface here, but this should give some sense of how much a language's syntax (and style) matters for writing readable code.

There are an outrageous number of programming languages, with more popping up all the time, and with this variety of languages comes a variety of syntax and style choices. If you are picking a new language (or, heaven forbid, creating a new language), choose carefully.

1 comment:

  1. If you've ever wondered what JavaScript would look like with a cleaner syntax, take a look at CoffeeScript.
