Feb 03 2011

Calculating the number of weekdays between two dates

Published by Dougal under Programming

Thinking about calculating the number of working days between two dates. (Note, I haven’t actually executed any of this code, just thought about it. It could be quite wrong.)

workingdays :: Day -> Day -> Integer
workingdays start end = undefined

For this scenario I assume that end is no earlier than start. (Obviously we can deal with this situation using some wrapper function to sort the dates if the result depends on this order.)

If start and end are the same day we define this as “zero working days”.

workingdays :: Day -> Day -> Integer
workingdays start end | start == end = 0
                      | otherwise    = undefined

If start and end are both the same day of the week then there will be a whole number of “working weeks” between the dates, eg one Monday to the next Monday is one working week. A working week is 5 days, so the number of working days is the number of working weeks times five.

We determine if two dates occur on the same day of the week by checking the number of days between them. Any number which is a multiple of seven (0, 7, 14, 21, …) indicates the two dates are a whole number of weeks apart. Multiples of seven produce a remainder of zero when divided by seven, so we divide the span by seven and check the remainder.

workingdays :: Day -> Day -> Integer
workingdays start end | start == end = 0
                      | is_sameday   = 5 * wholeweeks
                      | otherwise    = undefined
    where is_sameday = diffDays start end `mod` 7 == 0
          wholeweeks = diffDays start end `div` 7

At this point we notice that the first scenario is a simplified case of the second, where the two dates occur on the same day of the week with exactly no weeks between them. We can just collapse them into one calculation.

workingdays :: Day -> Day -> Integer
workingdays start end | is_sameday = 5 * wholeweeks
                      | otherwise  = undefined
    where is_sameday = diffDays start end `mod` 7 == 0
          wholeweeks = diffDays start end `div` 7

If the two dates don’t fall on the same day there may be some fraction of a working week between the two. The number of days leftover in the last week is the remainder when the number of weeks is calculated, which we’ve already worked out once. Rather than doing it twice we’ll separate it out into whole weeks and days.

workingdays :: Day -> Day -> Integer
workingdays start end | days == 0 = 5 * weeks
                      | otherwise = 5 * weeks + partweek
    where (weeks,days) = diffDays start end `divMod` 7
          partweek     = undefined

The bit we don’t know yet is the number of working days which make up the last part-week. The value of partweek is between zero and days with an upper limit of 5 because there can only be 5 working days between any two dates less than a week apart. At this point we’d be as well to just count them.

workingdays :: Day -> Day -> Integer
workingdays start end | days == 0 = 5 * weeks
                      | otherwise = 5 * weeks + partweek
    where (weeks,days) = diffDays start end `divMod` 7
          partweek     = length $ filter weekday $ enumFromTo newstart end
          newstart     = addDays (7 * weeks + 1) start

We count the number of weekdays from the day after the first day, since that day doesn’t count. At this point it probably hasn’t escaped you that we’ve got more repetition. We could remove the first case (for whole weeks) because partweek will evaluate to zero in that case.

If there is a nicer way to calculate the remaining part of the week I haven’t spotted it. I feel sure there must be something better than just looping over the remaining days, though maybe I am wrong.

Comments Off

Feb 02 2011

Implementation-defined file format and parser: tut tut.

Published by Dougal under Programming

In the last few days I’ve been snatching some moments to rewrite the awful, ugly parser in ComicBake into something readable and adaptable. While doing this I’ve been slowly rewriting the format of the accepted scripts. All of the improvements have been fairly subtle but sometimes with large improvements in usability and ease of implementation.

The big one is the comments. Previously the first N comments were hard-coded to contain the comic title, author and other information, prefixed with the “comment delimiter”, a double-dash symbol. I’ve adopted a key-value style instead which means comments can be included (or omitted) as necessary from the header without problem:

Old style:

-- Pandemic
-- Board games
-- Dougal Stanton
-- January 2011

New style:

-- title: Pandemic
-- series: Board games
-- author: Dougal Stanton
-- date: January 2011

At the moment I’m only storing the title and author names but it’s possible to tag anything now and use it later if necessary. The tags can be re-ordered as desired and if you miss them altogether I just use an empty string.

I’m also experimenting with adding directions to indicate speech/thought bubbles. At the moment the code uses this format but it may change:

Garfield: (Thinking)
    I hate Mondays

Sadly I’ve realised that the language is still ambiguous about divisions between scenes. A scene may end when a new one starts, or when the file ends. Getting it to manage both is proving awkward. I don’t want to add more syntactic elements just to make it easier to parse, so I’ll have to see how other languages which define loosely-grouped sections (eg, INI files) are parsed.

3 responses so far

Jan 27 2011

Pandemic Victory

Published by Dougal under Comics

comicstrip

3 responses so far

Jan 26 2011

Both hands on the ground, both feet in the air?

Published by Dougal under Culture, Hobbies

For about six months now I (and then Helen) have been going to Capoeira Angola classes run by Mão no Chão group. Suffice it to say we are neither of the people in this photograph in fact, it’s just one I found on Flickr.

Leninho esquivando do golpe de m. Goiano

Continue Reading »

2 responses so far

Jan 24 2011

Books: Incoming, outgoing and in a holding pattern

Published by Dougal under Books, Friends

Right now I’m reading The Three Musketeers by Alexandre Dumas. It’s really enjoyable so far — whimsical and witty like a 19th-century The Princess Bride (not inconceivable). I’ve got a big ol’ pile of things to get through after that. I still have a book from my birthday in June and a bunch from Christmas too. I came away from last night’s book group with two more — Pathfinders: The Golden Age of Arabic Science by Jim Al-Khalili. I’d been swithering over this one until I noticed the author. He has produced some great science television so I thought his book might be worth it. And Under Milk Wood, a play I associate strongly with my father though I’ve never heard or read it. But I’ve been quoted it a lot!

I took along Seamus Heaney’s translation of Beowulf but no-one was interested. I think a lot of people had book overload and weren’t taking new ones to read. We’re not having our next meeting until March so there will be plenty of time for people to finish the books they’ve got. Hopefully I can deplete my to-read pile slightly by then.

One response so far

Jan 23 2011

A look at Lightweight Static Capabilities (part 3)

This is the third part of my slow examination of Kiselyov and Shan’s Lightweight Static Capabilities paper. Start at the the beginning and then read part two to put this part into context.

Section 3 of the paper uses the same technique from Section 2 and adapts it to tackle a larger and more interesting problem: array bound checking. To motivate the problem they follow in the footsteps of the dependent type literature by implementing binary search with statically-checked “safe” array lookups.

To explain the approach they use we’ll start with an unsafe implementation that uses runtime bounds checking on each index.

bsearch :: (Ord a) => (a -> a -> Ordering) -> a -> Array a -> Maybe (Int,a)
bsearch cmp key arr = go 0 (length arr - 1)
  where go lo hi = if hi >= lo
                      then let midpoint = lo + ((hi - lo)`div`2)
                               curvalue = arr ! midpoint
                           in case cmp key curvalue of
                                  LT -> look lo (m - 1)
                                  EQ -> Just (m, curvalue)
                                  GT -> look (m + 1) hi
                      else Nothing

The index operation (!) checks at runtime to see if the supplied index is within the bounds defined by the array (we still have to catch an exception, which I haven’t shown here, but at least it won’t blow everything up). This check is not necessary if we can prove that the index will always be in range.

The general approach outlined in Section 2 was to brand “safe” operations and types which go together. Once a list had been branded non-empty (ie had type FullList) it could be freely passed around and later safely destructured into head and tail without worry.

In the case of lists the values were branded “safe” and the head/tail operations could only work on “safe” values. The head operation is analogous to indexing the 0th element in an array. If we were to extend the list approach to arrays we would need a deref0 for the 0th element, deref1 for the 1st element and so on. This is both unwieldy and needlessly restricts the type of indexing possible.

The solution proposed by the paper is a set of “safe” indexing operations which only work when both the array and the index are branded as being compatible. An index is compatible with an array if it is within that array’s bounds.

get :: BrandedArray b a -> BrandedInt b -> a

The type variable b here is a phantom type which brands arrays and indices. The get operation can only be used if the same brand is applied to both array and the index.

Initially the only “safe” array indices are the offsets of the upper and lower element. The brands for an array are created alongside the upper and lower index in one operation, providing the assurance to the type system that whatever type b represents it is identical for these three values:

brand :: Array -> (BrandedArray b a, BrandedInt b, BrandedInt b)

(In fact this is not the proper type signature. The proper signature is reproduced below, with existential quantification over the branding type b and continuations for passing/failing conditions. This signature is just a gentle introduction to get the idea across)

Further operations can be created with transformations on the provided indices: increments, decrements and midpoints.

Whatever b is here it’s the same type, but next time brand is called b may well be a different type. The type system can’t unify them since it doesn’t know the type — it’s never specified. So only indices created alongside the branded array can be used with it, since only in that situation can the type checker be sure that the type of b matches.

This seems a powerful use of phantom types and it’s one I’d never seen before. The paper uses the phrase “type eigenvariable” at this point, which I’m not familiar with but it’s such a nice phrase that I’ll have to keep an eye out for it.

As before the type constructors to enforce this safety are hidden and only accessible under specific and well-guarded conditions.

-- Not exported!
newtype BrandedArray b a = BrandArray a
newtype BrandedInt   b i = BrandInt i
 
brand :: Array -> (forall b. (BrandedArray b, BrandedInt b, BrandedInt b) -> w)
               -> w -> w
brand arr cont stop = if lo <= hi
                         then cont (BrandArray, BrandInt lo, BrandInt hi)
                         else stop

The user provides functions to deal with the two cases that the bounds are valid and invalid. This same pattern follows through to other parts of the implementation: condition checking is done internally and the user provides functions to deal with various cases. This keeps some of the constraints (like the existential quantification over b, the brand type) wrapped up. If the user tries to subvert this the type checker objects:

Inferred type is less polymorphic than expected
  Quantified type variable `b' escapes
In the second argument of `brand', namely `Just'
In the expression: brand testarray Just Nothing
In the definition of `it': it = brand testarray Just Nothing

In order to implement the rest of binary search over arrays some arithmetic functions on indices are introduced. These functions exist inside the trusted kernel so some effort is made to ensure they are correct. They also require passing/failing functions for operations that may fail (such as finding the successor of a valid index: the successor may be out of bounds):

-- If i1 and i2 are within bounds then their midpoint
-- is within bounds.
bmiddle :: BrandedInt b -> BrandedInt b -> BrandedInt b
bmiddle i1 i2 = BrandInt (i1 + (i2 - i1) `div` 2)
 
bsucc :: BrandedInt b -> BrandedInt b
                      -> (BrandedInt b -> w)
                      -> w -> w
-- If i is less than the upper bound we use continue calculation
-- with the inside else we fail with outside.
bsucc (BrandInt upper) (BrandInt i) inside outside = 
    let i' = i + 1
    in if i' <= upper then inside (BrandInt i') else outside

There is also a predecessor function which works analogously. First we need to rework the binary search for this style of programming:

bsearch :: (Ord a) => (a -> a -> Ordering) -> a -> Array a -> Maybe (Int,a)
bsearch cmp key arr = brand arr find Nothing
  where find (barr,lower,upper) = go lower upper
    where go lo hi = let m = bmiddle lo hi
                         x = unsafeGet barr m
                     in case cmp key x of
                        LT -> bpred lo m (\i -> go lo i) Nothing
                        EQ -> Just (unbrand m, x)
                        GT -> bsucc hi m (\i -> go i hi) Nothing

On each iteration of the algorithm finding the value at the midpoint is always a safe operation provided the two arguments are safe. Incrementing or decrementing the upper/lower bound might fail if the index goes out of range, but this check can be considered part of the end condition, ie the search finishes unsuccessfully anyway if the index goes off the end because it’s an indication that the value we’re looking for is not stored.

The call to unsafeGet is, in fact, safe, since the index m is guaranteed to be in range for this array. No runtime checks are required over those needed to indicate the end condition. As before this problem is always dealt with formally using a similar mechanism for the empty-list problem, using Strict and Lax languages and mappings from one to the other. I’ll look at that next time.

Comments Off

Jan 22 2011

ComicBake with comic publishing capabilities!

Published by Dougal under Programming

Version 0.2 of ComicBake has just been pushed to the online repository. The changes since 0.1 were mostly internal reorganisation though there were some useful additions:

  • Rounded corners for speech bubbles
  • Use estimated box size for layout to prevent accidental overlaps and increase accuracy of inter-bubble spacing.
  • Publish to Flickr

The last one is the most important new addition. Now, once you’re happy with the look of the comic you can push it to Flickr. The client remembers logins between invocations so once you’ve authorised it to upload everything will work smoothly.

To this end I’ve split apart the function of the comicbake tool into build and publish sub-commands. Build works as before and is the default behaviour if you don’t specify a command.

To make and upload a previously scripted comic you now do this (making use of the default comicstrip.png file name):

$ comicbake build -i scripts/mycomic.script
$ comicbake publish --title="My great comic"

There are still plenty of issues to work out with this interface, and further work needed elsewhere, but ultimately I’m pretty happy with how things are working out.

Comments Off

Jan 19 2011

Dorian Gray and enjoyment of reading

Published by Dougal under Books, Reviews

I finished The Picture of Dorian Gray at least a week ago and I’ve been struggling to put into words what I thought about it ever since. It’s a pretty slim novel but it took me a few weeks to get through so it obviously wasn’t enthralling.

My main problem, I think, was that it had a plot but no story. I felt no desire to read on other than to find out how the plot resolved. The characters were bland at best, and often both hateful and boring. Dorian Gray wishes that his portrait would get older instead of him, which seems to stop him maturing at all. The book was originally much shorter, and it shows — I felt there was a lot of filler which expanded it from a short story to a novel.

I got a lot more enjoyment from the introduction which placed the book in a historical context and recounted some of the reactions to its publication. It’s strange how some books are more fun to read about than they are to read.

Incidentally, I object to the inclusion of an introduction which serves to give away large segments of the plot. So much so that they required a pre-introduction to tell you not to read the introduction until you’d read the book. Wouldn’t it just be more appropriate to put the spoilers at the end and call it something else?

Comments Off

Jan 18 2011

A look at Lightweight Static Capabilities (part 2)

This is part 2 of a series of posts looking at the paper Lightweight Static Capabilities by Kiselyov and Shan. Part 1 can be found here.

The next section of the paper is more theoretical. The authors introduce two notional languages, Strict and Lax, and their typing mechanisms. Strict has a “fancy” type system which specifies two list types (List and List+) and only the latter type can be deconstructed with head and tail functions. The List+ type can only be created by a special nonempty conversion, which cannot be applied to empty lists.

The Lax language has no such advanced typing facility and will happily accept nonempty(nil) as a legal construct. This language is more like a traditional functional language with a possibly-empty list type. In terms of typable operations Strict defines a smaller set than Lax, and every legal Strict program is a legal Lax program too. The authors define a mapping from Strict to Lax programs to allow embedding a statically safe program in Haskell, ML, etc.

This embedding is ultimately what we always attempt to do when programming. The set of things which the compiler will accept is necessarily larger than the set of things which will get us the right answer. Type systems aim to constrain the “acceptable” set slightly though there is still plenty of scope for bugs. :-) Some programming languages make it easier to write programs which are dangerously but subtly wrong. Using the full capabilities of a language like C all the time is incredibly foolhardy: not every situation demands pointer arithmetic. If we can use the type system to wall off “dangerous” code then all the better. Another example is the use of IO actions or unsafe* operations in Haskell. These are often used only where necessary in the former case, or walled off in an opaque module in the latter case. This walling-off allows us to concentrate our verification efforts to areas which can cause the most problems.

The resulting “embedded” code is the trusted kernel mentioned in the previous post. The reduction rules defined for Strict and Lax are contained in this kernel. The reduction rules for the non-empty list case are:

head (nonempty (cons E1 E2)) -> E1
tail (nonempty (cons E1 E2)) -> E2
indeed nil E1 E2             -> E1
indeed (cons E E') E1 E2     -> E2 (nonempty (cons E E'))

The first two rules define the operation of head and tail on a non-empty list. These operations aren’t defined for empty lists. The second two rules show how the non-empty type is constructed if the list is truly non-empty. Notice the similarity between indeed and Haskell’s maybe function (with arguments rearranged):

indeed :: List a -> b -> (FullList a -> b) -> b
maybe :: Maybe a -> b -> (a          -> b) -> b

The maybe function forces the user to deal with all cases in the option type by requiring functions and default values for each case. The indeed function goes one step further by not allowing any other way to take apart the list type — in order to use head and tail you have to convert the List to a FullList and only indeed can do that.

In the spirit of the maybe function for Maybe types and the either function for Either types it would seem sensible to have a list function for List types, which would take a function for each case suggested by a list. After a bit of reflection I think this is the fold, or rather maybe and either are specific folds.

Which all suggests that the toy example provided (reversing a linked list) can be safely recreated using a fold. If we look again at the safe version defined in the previous post we see two cases we have to deal with:

reverse :: List a -> List a -> List a
reverse xs acc = case full xs of
            Nothing -> acc
            Just ls -> reverse (tail ls) (cons (head ls) acc)

Stripping away all the plumbing stuff that’s going on here, we have two vital bits:

  1. The return of the accumulator when the list is no longer non-empty, ie is empty.
  2. Consing the head of the list onto the accumulator at each stage that the list is non-empty.

Step 1 is done automatically by the fold: it lets you iterate over values without being concerned about them running out. Step 2 is just the cons operation we know, with the order of arguments reversed. To actually reverse a list we supply an empty accumulator.

reverse in = foldl combiner acc in
  where combiner xs x = cons x xs
        acc = []

We know we can define this simple function as a one-liner but not all functions are as simple. The next section looks at array indexing operations. The motivating example this time is binary search over arrays — checking that all indexing operations remain in-bounds and that these constraints are ensured by the type system. By contrast with list reversal a binary search deals with several separate elements (the array, the lower and upper index) which cannot be hidden completely if we are to allow general array operations.

One response so far

Jan 17 2011

Greatest CD purchase never made

Published by Dougal under Music

Several years ago I bought some stuff in HMV and was given a free sampler CD to promote a number of artists’ new releases. This CD has turned out to be a fantastic resource and I’ve not yet mined all of the possibilities.

The disc included Death Cab for Cutie, Skin and Bell X1 — I now have at least one album by all of these artists. OK Go were also there, and they’ve since become heavily played on YouTube for their great music videos. I intend to look more into their music too.

More things to look into: Ladyfuzz, Nightmare of You, Stellastar, Josh Rouse. The lesson learned is that if HMV give you a sampler CD you really should take it. It will probably be incredibly well curated and highlight a whole bunch of interesting artists. It worked for me.

One response so far

« Prev - Next »