This document explains some syntactical innovations included in merd.
See here for a more detailed syntax description.

Function Calls

Here are the different syntaxes used for function calls:
gcd(10, 4) (gcd 10 4) gcd(10,4,r)

Facts:

In merd, I propose to use gcd(10,) as sugar for x -> gcd(10,x), instead of ML's (gcd 10).
In gcd(10, ), I call the empty parameter a hole.

Example of use: length = foldl(, 0, (+ 1))
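
The hole sugar can be approximated in Python to see how it behaves. This is only a sketch: HOLE, holes and foldl are hypothetical names, and the argument order of foldl (list, initial value, step function) is an assumption read off the example above, not merd's definition.

```python
from functools import reduce
from math import gcd

HOLE = object()  # hypothetical sentinel standing in for merd's empty parameter


def holes(f, *args):
    """Return a function that fills each HOLE in args, left to right."""
    n_holes = sum(1 for a in args if a is HOLE)

    def filled(*rest):
        if len(rest) != n_holes:
            raise TypeError(f"expected {n_holes} argument(s), got {len(rest)}")
        supplied = iter(rest)
        return f(*(next(supplied) if a is HOLE else a for a in args))

    return filled


def foldl(xs, init, step):
    """Left fold; the argument order is assumed, see the note above."""
    return reduce(step, xs, init)


gcd10 = holes(gcd, 10, HOLE)                               # like gcd(10,)
length = holes(foldl, HOLE, 0, lambda acc, _elem: acc + 1)  # like foldl(, 0, (+ 1))
```

Unlike ML-style currying, the hole can sit at any position, which is what lets length leave the first argument open.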

Pros:

Cons:

This hole can be used in function declaration too:

member?(e,) =
    [] -> False
    e : _ -> True
    _ : l -> member?(e,l)
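
Read this way, member?(e,) denotes a function that still expects the list. A rough Python equivalent, as a sketch (the name member and the use of Python lists in place of merd's cons lists are assumptions):

```python
def member(e):
    """Curried membership test: member(e) returns the function over the list."""

    def on_list(l):
        if not l:               # [] -> False
            return False
        if l[0] == e:           # e : _ -> True
            return True
        return on_list(l[1:])   # _ : l -> member?(e,l)

    return on_list
```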

is "id(1,2)" allowed when "id" expects one argument?

With "id(x) = x", one could allow "id(1,2)" where x's value is the tuple (1,2).

Disallowing this makes higher-order programming harder:

apply(f,x) = f(x)
myfirst(a,b) = apply((x,_ -> x), (a,b))
That's why merd will allow id(1,2) and rely on type-checking to catch bad use of functions.
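
The idea that several arguments are just one tuple argument can be mimicked in Python. This is a sketch under assumptions: the tupled wrapper is hypothetical, and Python checks nothing statically, whereas merd would rely on its type-checker.

```python
def tupled(f):
    """Wrap f so that call(a, b) means f((a, b)) and call(a) means f(a)."""

    def call(*args):
        return f(args[0] if len(args) == 1 else args)

    return call


ident = tupled(lambda x: x)                 # id(x) = x
apply = tupled(lambda fx: fx[0](fx[1]))     # apply(f,x) = f(x)
first = tupled(lambda pair: pair[0])        # x,_ -> x
myfirst = tupled(lambda ab: apply(first, ab))
```

Here ident(1, 2) returns the tuple (1, 2), and myfirst works because apply can pass the pair (a, b) to first as a single value.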

WYSIHIIP

WYSIHIIP = What You See Is How It Is Parsed

I invented this word to classify the cases where the parsing is misleading. It belongs to the more general principle of least surprise.


Horizontal Layout

Here are some examples

There are 2 (non-exclusive) solutions:


Indentation Based Grouping

It is also called vertical layout.
The classic example is
if (C1)
  if (C2)
    S1;
else
  S2;
which is terribly misleading because the indentation suggests that the else belongs to the first if, whereas the parser attaches it to the nearest one, if (C2).

The solution is to base the grouping on indentation.
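
Python already groups by indentation, so it can show both readings of the snippet side by side. In this sketch the function names are made up, and S1 and S2 are replaced by return values so the difference is observable.

```python
def as_c_parses(c1, c2):
    """The grouping a C parser picks: else binds to the nearest if."""
    if c1:
        if c2:
            return "S1"
        else:
            return "S2"
    return None


def as_indented(c1, c2):
    """The grouping the indentation suggests: else belongs to the outer if."""
    if c1:
        if c2:
            return "S1"
    else:
        return "S2"
    return None
```

With c1 true and c2 false, the first version runs S2 and the second runs nothing, which is exactly the surprise the C layout hides.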

Pros:

Cons:

Proposal

merd completely generalizes the layout scheme found in Haskell (Python's layout is even simpler):
aaaaa
  bbb
    c
    c
aaaaa
is the same as (aaaaa ; (bbb ; (c) ; (c))) ; (aaaaa)
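
The translation above can be sketched as a small Python program. It is a simplification that only handles this basic case (spaces for indentation, no continuation lines), not merd's actual layout algorithm; parse and render are hypothetical names.

```python
def parse(lines):
    """Build a forest of (text, children) nodes from indented lines."""
    root = []
    stack = [(-1, root)]  # (indent, children list) of the groups still open
    for line in lines:
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip(" "))
        # Close every group that is indented at least as much as this line.
        while stack[-1][0] >= indent:
            stack.pop()
        node = (line.strip(), [])
        stack[-1][1].append(node)
        stack.append((indent, node[1]))
    return root


def render(nodes):
    """Print each node as (text ; children...), siblings separated by ' ; '."""
    return " ; ".join(
        "(" + " ; ".join([text] + ([render(children)] if children else [])) + ")"
        for text, children in nodes
    )


source = ["aaaaa", "  bbb", "    c", "    c", "aaaaa"]
grouped = render(parse(source))
```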

Choosing the operator and function names


Choice of function names

Rules for choosing:

Choice of operator names

See Syntax Across Languages to see what other languages are using.


Operator priorities (precedence)

Instead of numbered priorities, it would be better to do it the Cecil way: define a partial-order relation on operators.
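
A minimal Python sketch of what such a partial order could look like. The operators and the relations between them are invented for illustration; they are neither Cecil's nor merd's actual declarations.

```python
def transitive_closure(pairs):
    """Extend a set of (tighter, looser) pairs with everything they imply."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Hypothetical declarations: each pair reads "left binds tighter than right".
TIGHTER = transitive_closure({
    ("*", "+"), ("+", "=="), ("==", "&&"), ("&&", "||"),
    ("<<", "=="),  # "<<" is deliberately left unrelated to "+" and "*"
})


def compare(op1, op2):
    """Return 'tighter', 'looser', or None when the operators are unrelated."""
    if (op1, op2) in TIGHTER:
        return "tighter"
    if (op2, op1) in TIGHTER:
        return "looser"
    return None  # a parser could demand explicit parentheses here
```

The benefit over numbered priorities is the None case: mixing two unrelated operators can be rejected instead of silently picking a grouping.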

Various


Association Variable Name & Type

Introduction

Do you know FORTRAN? No? Well, FORTRAN didn't have explicit typing. Instead it had implicit typing based on the variable name: names starting with I, J, K, L, M or N are ints and all others are floats. Of course, having a type tied to each variable name is very limiting. That's why, since FORTRAN, languages have avoided this feature.

But people like that idea. The Hungarian notation is based on this:

Long, long ago in the early days of DOS, Microsoft's Chief Architect Dr. Charles Simonyi introduced an identifier naming convention that adds a prefix to the identifier name to indicate the functional type of the identifier.
A big limitation of the Hungarian notation is that it's only a convention, not enforced by the C compiler[7]. It also takes away some readability. Perl is another case of associating variable names with types. It uses the prefixes $, @ and %. This is quite verbose, as most variables are $ prefixed. It doesn't help readability and lowers expressivity.

Proposal

Give the programmer the ability to associate a variable name with a type. This is different from declaring a global variable: it only states that every time a variable of that name is used, its type must be compatible. E.g. (inspired by Haskell's Prelude):
vartype c = Char

isDigit c =  c >= '0' && c <= '9'
...
primExitWith :: Int -> IO a
primExitWith c = IO (\ f s -> Hugs_ExitWith c)
will fail to typecheck because of c in primExitWith.

Another example inspired by Scheme:

vartype ".*\?" = a -> Bool
vartype ".*\!" = a -> Unit
This enforces the convention that functions of the form xxx? are predicates and xxx! are mutators.

A good scope for this association is the module. Exporting this association seems a nice feature to ensure a global behaviour.
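
The lookup behind vartype can be sketched in Python. This is a crude stand-in under stated assumptions: types are plain strings, "compatible" is reduced to string equality, and the names VARTYPES, expected_type and check are hypothetical. merd would do this inside its type-checker.

```python
import re

# Per-module associations, mirroring the vartype declarations above.
VARTYPES = [
    (r"c", "Char"),
    (r".*\?", "a -> Bool"),
    (r".*!", "a -> Unit"),
]


def expected_type(name):
    """Return the type associated with a name, or None if it is unconstrained."""
    for pattern, type_name in VARTYPES:
        if re.fullmatch(pattern, name):
            return type_name
    return None


def check(name, declared_type):
    """Accept a binding unless it contradicts the association for its name."""
    expected = expected_type(name)
    return expected is None or expected == declared_type
```

With these declarations, binding c at type Int is rejected (the primExitWith case), while names that match no pattern are left alone.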

Pros:

Cons:

Open-ended lists

animals = [
    "cat",
    "dog",
]
is not valid Haskell code because of the last comma. This is very annoying because the last line must be treated differently.
(OCaml, Python, Ruby, Perl, C... accept it.)

But beware, it also means that

f(foo,
  bar,)
f(foo,
  bar,
)
are not the same. The first introduces a hole, but not the second one.
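
For comparison, Python accepts the trailing comma in both places and never treats it as a hole, so the two call layouts are equivalent there. A small illustration (f is a made-up function):

```python
animals = [
    "cat",
    "dog",
]


def f(foo, bar):
    return (foo, bar)


# In Python both layouts pass exactly two arguments; in merd the first
# one would instead leave a hole after bar.
same_line = f("foo",
              "bar",)
next_line = f("foo",
              "bar",
)
```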

One element tuple

Why is 1-uple needed?

In languages that allow computing with tuples (e.g. (1,2) + (3,4) => (1,2,3,4)), it is necessary to have 1-element tuples. Otherwise you have to allow:
 (1,2) + 3 => (1,2,3)
which is no good for catching errors (at compile-time for merd, at run-time for Python).
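
Python's run-time behaviour shows the point: concatenating a tuple with a bare value is rejected, and appending one element goes through a 1-element tuple.

```python
pair = (1, 2)

joined = pair + (3, 4)   # tuple + tuple is concatenation
extended = pair + (3,)   # appending one element needs a 1-element tuple

try:
    pair + 3             # tuple + int is an error, caught at run-time
    rejected = False
except TypeError:
    rejected = True
```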

The ability to compute tuples is very important to handle things like the compile-time typed printf, or things like macro-processing.

The 1-uple syntax issue

merd uses the comma to construct tuples. Alas this doesn't handle 0-uple and 1-uple.
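
Python has the same problem and shows one possible answer, given here only for comparison and not as merd's choice: the comma builds the tuple, a trailing comma marks the 1-element case, and empty parentheses are a special form for the 0-element case.

```python
zero = ()            # special syntax, since there is no comma to write
one = (1,)           # the trailing comma makes it a tuple
not_a_tuple = (1)    # plain parentheses only group: this is the int 1
two = 1, 2           # the comma alone is enough for 2 or more elements
```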

Recursivity


Notes

[1]

[2]

[3] And eta expansion is preserved:

But note that evaluation time is kind of weird in merd. Partial evaluation is used...

[4] Even worse, return (1-2).abs is parsed as return((1-2).abs), which shows that return is parsed differently even though it has a functional syntax just like Math.sqrt. return has a lower precedence.

Another non-WYSIHIIP Ruby example: p (1..10).to_a is parsed as (p(1..10)).to_a.

An example of why raising method priority would fail is sin(0.7).to_i

"ruby -w" catches most of this problem, so use it!

[5] You can't even use the fact that && has precedence over || or you get
``warning: suggest parentheses around && within ||''

[6] experimentation is needed to know if this rule could work for more than one space, eg:
1 + 2  *  3 parsed as (1+2)*3
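
To experiment with this, here is a small Python sketch. It assumes the rule is "the operator surrounded by more spaces binds more loosely", with the usual precedence and left associativity breaking ties; that reading is inferred from the example above. It only handles non-negative integers and the four binary operators, and eval_spaced is a hypothetical name.

```python
import operator
import re

# Usual precedence, used only to break ties between equally spaced operators.
OPS = {
    "+": (1, operator.add), "-": (1, operator.sub),
    "*": (2, operator.mul), "/": (2, operator.truediv),
}


def eval_spaced(expr):
    """Evaluate expr, letting wider spacing around an operator loosen it."""
    # Keep each operator together with its surrounding spaces.
    return _eval(re.split(r"( *[-+*/] *)", expr.strip()))


def _eval(parts):
    if len(parts) == 1:
        return int(parts[0])

    def looseness(i):
        spaces = len(parts[i]) - 1
        precedence = OPS[parts[i].strip()][0]
        # More spaces first, then lower precedence, then rightmost.
        return (spaces, -precedence, i)

    # Operators sit at the odd indices; split at the loosest one.
    split = max(range(1, len(parts), 2), key=looseness)
    combine = OPS[parts[split].strip()][1]
    return combine(_eval(parts[:split]), _eval(parts[split + 1:]))
```

Under these assumptions "1 + 2  *  3" groups as (1+2)*3, while "1 + 2*3" and "1 + 2 * 3" both keep the usual 1+(2*3).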

[7] Associating a type with a variable is not easy, especially in C where coercions are everyday life. I don't think it would be possible to enforce the association without losing a lot of expressivity.

Some more info about the Hungarian notation.


Pixel
This document is licensed under GFDL (GNU Free Documentation License).

Release: $Id: choices_syntax.html,v 1.22 2002/09/23 11:25:54 pixel_ Exp $