Error handling

known solutions

There is 3 trends in error handling:

raise an exception

This is the current trend. The advantage is that you don't have to worry about errors, they will not be ignored. Problems with this solution:

return a ``bad'' value

C is using this quite a lot. This can be very dangerous in an unsafe language because if an error occurs and is not checked, the program will crash (eg: uncheck malloc which fails)

ignore the error

Perl is also returing ``bad'' values, but ensure things will fail silently when using that value. Problems are just ignored. In some case it is what you want.

Pliant has an option safe:

should an Input/Output error occur on the stream, safe will prevent the raise of an exception. Instead, it will mark the file as crashed and continue. Any further attempt to write will be ignored, and attempts to read will return as much zeros as requested.
(error handling in pliant)

merd

Giving the choice to ignore or not the errors seems a good solution, at least when proposing the choice is simple, and when errors can be ignored.

In any case, merd error handling is mainly exception based. But syntactic sugar can be used to alleviate the burden of exception catching. I propose to use the or operator:

Lazy(a) or Lazy(b) := try a with _ -> b
This allow consise expressions like:
tmpdir = Sys::env{"TMPDIR"} or "/tmp"
i = Sys::argv[0].to_int or die("usage: {Sys::progname} <nb>\n")
TODO: describe cdie

links about exceptions

Default value

Let me introduce something alike perl's powerful ||, but type safe. Many types have a minimum value: Let's create a type class giving this default value:
Default_val = class
    *** := O
    ? := O -> Bool
    &&& := O,a -> a !> O ; a
    ||| := a,a -> a !> O ; a

    &&&= := Inout(O), a -> ()
    |||= := Inout(O), a -> ()
This enables some powerful construct, at least for strings and lists: The number case is more problematic as 0 is not really a minimum value. An analysis of the use of default value in perl shows that the empty string and the empty list are used a lot as default values. For numbers it is not so.

Well in fact, this Default_val needs testing to see if it would be useful.

Strings

In merd, you can use "each" to iterate on a string, but you can't index a string. Strings are also immutable.

TODO: explain why indexing is bad (the argument that indexing is costly is bad (it is also costly on lists)), explain why mutability is bad

Mapping bags

Here are examples of what one would expect to work:
expression result and type
(1, 2, 4).map(* 2) (1, 4, 8) !< List(Uint)
(1..4).map(* 2) (1, 4, 6, 8) !< List(Uint)
Node(Leaf(1), Leaf(2)).map(* 2) Node(Leaf(1), Leaf(4)) !< Tree(Uint)
Set::new(1, 2, 4).map(* 2) set is { 1, 2, 4 } !< Set::Set(Uint)
Having all this together is quite hard:

Listing directory entries and ".", ".."

(ruby-talk:35726)
> Bob Alexander wrote:
> > 
> > On the topic of whether ".." and "." should be in our directory
> > listings:
> > [...]
> > Can someone suggest an argument as to why having them in the listed
> > entries is useful?
> 
> I often use ls -al in the shell because I also need to see the 
> permissions and/or owner of '.'
> And it is useful in the shell to use '..' as a navigation tool.

be aware that most shells trick on you about '..' [1]:

% mkdir -p dir1/dir2
% ln -s dir1/dir2 link1
% cd link1
% pwd
/tmp/link1

% ls ..
dir2
% (cd .. ; ls)
dir1

% cd ..
% pwd        # should be /tmp/dir1
/tmp


Another extreme argument against the use of "..":

# An insecure chdir("..") syscall is done after removing content of a
# subdirectory in order to get back to the upper directory during recursive
# removal of directory tree.
#
# Example of 'rm -fr /tmp/a' removing '/tmp/a/b/c' directory tree:
# 
# (strace output simplified for better readability)
# 
# chdir("/tmp/a")                         = 0
# chdir("b")                              = 0
# chdir("c")                              = 0
# chdir("..")                             = 0
# rmdir("c")                              = 0
# chdir("..")                             = 0
# rmdir("b")                              = 0
# fchdir(3)                               = 0
# rmdir("/tmp/a")                         = 0
# 
# After current directory is changed to /tmp/a/b/c a race condition occurs.
# If we then move /tmp/a/b/c directory to the /tmp/c two subsequent
# chdir("..") syscalls will move to the root directory / and rm will start
# removing files from the whole file systems if it has enough privileges
# (i.e. if called by root user).


> But when I programmatically iterate over directory entries,
> I don't care about '.' or '..'

agreed. get rid of '.' and '..' !!


[1] bash, zsh and pdksh do not really use chdir("..") when asked "cd .."
    tcsh, sash do really use chdir("..") when asked "cd .."

Growing data structures

Arrays

``scripting'' languages use growing arrays that don't need pre-allocating.
array = []
array[99] = "ninety-nine"
Try this with vector in C++, and you'll get a segmentation fault or even worse. In python and java, you'll get an exception. The more sensible solution would be to catch out of bounds read access and permit writing out of bounds. The problem of course is what to do when you write sparsely, aka what is the value of array[0], array[1]... solutions are:

Dictionaries

Here is an example: ``counting the common words in a list''

Perl:
my %count;
$count{$_}++ 
    foreach @ARRAY;
Python:
l = {}
for i in array:
    try: l[i] += 1
    except: l[i] = 1
Ruby:
l = {}
array.each{|i|
  l[i] ||= 0
  l[i] += 1
}

Another example: ``multiple value per key dictionnary''

Perl:
my %ttys;
foreach (`who`) {
    my ($user, $tty) = split;
    push @{$ttys{$user}}, $tty;
}
Python:
ttys = {}
who = commands.getoutput("who")
for i in who.split("\n"):
    (user, tty) = i.split(None)[0:2]
    try: ttys[user] += [tty]
    except: ttys[user] = [tty]
Ruby:
ttys = {}
`who`.each{|e|
  (user, tty) = e.split
  ttys[user] ||= []
  ttys[user] += [tty]
}

The python solution is the simplest whereas the Perl solution can be nice but is dangerous as it also allocates on read-access, not only write-access, eg:
you access the first tty for user "foo" $ttys{"foo"}[0], but the user doesn't exist. It doesn't give an error, it allocates an entry for "foo" with an empty list. Much later you'll assume that the list can't be empty and something will fail. Finding where the error really comes from is a hell.
The more sensible solution would be to allocate only when you write or read/write. Merd may enable this.

Exception-like:Permissive:
pb1
count = Dict::new(Strict)
array.each(e -> count{e}++ or count{e}=1)
count = {}
array.each(e -> count{e}++)
pb2
ttys = Dict::new(Strict)
popen("who").each(s ->
    user, tty = s.words[0,1]
    ttys{user} = (ttys{user} or []) + [tty]
)
ttys = {}
popen("who").each(s ->
    user, tty = s.words[0,1]
    ttys{user} += [tty]
)

The ability to overload functions based on the In/Out/Inout type of the result gives this. It also relies on the Default_val value.

Packages (Modules)

Scope and Sub-packages

Work In Progress!

(sub-packages are also called Hierarchical Modules)

Just like in Python, fully qualified identifiers can still depend on the current package we are in:
in package A::B, C::foo is preferably A::C::foo if package A::C exists, and is C::foo otherwise.

Example:

rm -rf A test.py
mkdir A
cd A
touch __init__.py
echo 'import C' > B.py
echo 'def a_b(): return C.a_c()' >> B.py
echo 'def a_c(): return "a_c"' > C.py
cd ..
echo 'import A.B' > test.py
echo 'print A.B.a_b()' >> test.py
python test.py

Some discussion about Haskell's choice on this


Pixel
This document is licensed under GFDL (GNU Free Documentation License).

Release: $Id: choices.html,v 1.17 2003/01/09 18:30:56 pixel_ Exp $