A Monad for on-demand file generation?

by Joachim Breitner-2

Hi,

For an application such as an image gallery generator, which works on a
bunch of input files (assumed to be constant during one run of the
program) and generates or updates a bunch of output files, I often have
to track manually which input files a given output file depends on, so
that I can check the timestamps to see whether the file needs to be
re-created.

I thought for a while about how to do this with a monad that does the
bookkeeping for me. Assuming it’s called ODIO (On-Demand IO), I’d like a
piece of code like this:

do file1 <- readFileOD "someInput"
   file2 <- readFileOD "someOtherInput"
   writeFileOD "someOutput" (someComplexFunction file1 file2)

to actually read "someInput" and "someOtherInput", do the calculation,
and write the output only if the inputs have newer timestamps than the
output.

The problem I stumbled over is that the type of >>=
 (>>=) :: Monad m => m a -> (a -> m b) -> m b
means that I cannot "look ahead" to see which files would be written
without actually reading the requested file. Of course looking ahead is
not always possible, although I expect code like this to be the exception:

do file1 <- readFileOD "someInput"
   file2 <- readFileOD "someOtherInput"
   let filename = decideFileNameBasedOn file2
   writeFileOD filename (someComplexFunction file1 file2)

But assuming that the input does not change during one run of the
program, it should be safe to use "unsafeInterleaveIO" to open and read
the input only when it is used. Then readFileOD could put the timestamp
of the file it reads into a monad-local state, and writeFileOD could
skip the writing if the output is newer than all inputs listed in the
state. The unsafeInterleaveIO’ed file reads are then skipped as well,
unless they were required for deciding the control flow of the program.
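
The laziness this relies on can be sketched in isolation. The following
is only a minimal illustration, not the ODIO implementation:
deferCounted and readsPerformed are hypothetical names, the "file" is a
constant string, and the IORef counter exists purely to make the effect
observable. An action wrapped in unsafeInterleaveIO runs only if its
result is demanded, so skipping the write also skips the read.

```haskell
import Control.Exception (evaluate)
import Control.Monad (void, when)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import System.IO.Unsafe (unsafeInterleaveIO)

-- | Hypothetical helper: defer an action until its result is demanded,
-- and count how often the deferred action really runs.
deferCounted :: IORef Int -> IO a -> IO a
deferCounted counter action =
  unsafeInterleaveIO (modifyIORef' counter (+ 1) >> action)

-- | Pretend to read an input, then use it only if the output is stale.
-- Returns how many deferred "reads" were actually performed.
readsPerformed :: Bool -> IO Int
readsPerformed outputIsStale = do
  counter <- newIORef 0
  -- Nothing is "read" here yet; the action is merely scheduled.
  contents <- deferCounted counter (pure "pretend file contents")
  -- Forcing the contents stands in for writing the output file.
  when outputIsStale $
    void (evaluate (length contents))
  readIORef counter
```

Under these assumptions, readsPerformed False reports 0 reads and
readsPerformed True reports 1.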

One nice thing is that the implementation of (>>) knows that files read
in the first action will not affect files written in the second, so in
contrast to MonadState, we can forget about them, which I hope leads to
quite good guesses as to what files are relevant for a certain
writeFileOD operation. Also, a function
  cacheResultOD :: (Read a, Show a) =>  FilePath -> a -> ODIO a
can be used to write an (expensive) intermediate result, such as the
extracted exif information from a file, to disk, so that it can be used
without actually re-reading the large image file.

Is that a sane idea?

I’m also considering using this example for a talk about monads at the
GPN¹ next weekend.

Greetings,
Joachim

¹ http://entropia.de/wiki/GPN7

--
Joachim "nomeata" Breitner
  mail: mail@... | ICQ# 74513189 | GPG-Key: 4743206C
  JID: nomeata@... | http://www.joachim-breitner.de/
  Debian Developer: nomeata@...


_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: A Monad for on-demand file generation?

by Luke Palmer-2

2008/6/30 Joachim Breitner <mail@...>:
> The problem I stumbled over was that considering the type of >>=
>  (>>=) :: Monad m => m a -> (a -> m b) -> m b
> means that I can not "look ahead" what files would be written without
> actually reading the requested file. Of course this is not always
> possible, although I expect this code to be the exception:

I am somewhat unclear about what you are asking.  My first impression
though is that if you're running into trouble with "looking ahead",
then this algebra is probably not a Monad.  In fact, these use cases
indicate an applicative functor to me (Control.Applicative).  Of
course, the problem with applicative functors is that the syntax that
goes with them is not as imperative as monad syntax, and syntax is a
big motivator for finding monads.

Luke

Re: A Monad for on-demand file generation?

by Derek Elkins

On Mon, 2008-06-30 at 12:04 +0200, Joachim Breitner wrote:

> Is that a sane idea?

You may want to look at Magnus Carlsson's "Monads for Incremental
Computing" http://citeseer.comp.nus.edu.sg/619122.html


Re: A Monad for on-demand file generation?

by Joachim Breitner-2

Hi,

On Monday, 30.06.2008, at 07:08 -0500, Derek Elkins wrote:
> You may want to look at Magnus Carlsson's "Monads for Incremental
> Computing" http://citeseer.comp.nus.edu.sg/619122.html

Not exactly what I need, but a very interesting read. Maybe I can use
some of the ideas.

Thanks,
Joachim


Re: A Monad for on-demand file generation?

by Ryan Ingram

Some comments:

1) unsafeInterleaveIO seems like a big hammer to use for this problem,
and there are a lot of gotchas involved that you may not have fully
thought out.  But you do meet the main criteria (file being read is
assumed to be constant for a single run of the program).

If you have the ability to store metadata about the computation along
with the computation results, maybe that would be a better solution?

2) I agree with Luke that this "smells" more like an applicative
functor.  But getting to monad syntax is quite nice if you can do so.
As an applicative functor you would have "writeFileOD :: Filename ->
ODIO ByteString -> ODIO ()"; then writeFileOD can handle all the
necessary figuring out of timestamps itself, and you get the bonus
guarantee that the contents of the files read by the "ODIO ByteString"
argument won't affect the filename you are going to output to.

3) Instead of (Read,Show), look into Data.Binary if you actually care
about efficiency.  Parsing text at read time will almost never be
faster than just performing the computation on the source data again.
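
To make point 3 concrete, here is a small sketch of a Data.Binary round
trip. ExifInfo and its fields are hypothetical stand-ins for whatever
intermediate result is being cached, and serialise/deserialise are
illustrative wrapper names; encode, decode, put and get are the
functions from the binary package.

```haskell
import Data.Binary (Binary (..), decode, encode)
import qualified Data.ByteString.Lazy as BL

-- | Hypothetical cached result, e.g. a few fields extracted from an image.
data ExifInfo = ExifInfo
  { camera :: String
  , width  :: Int
  , height :: Int
  } deriving (Eq, Show)

-- | Hand-written instance: fields are written and read in the same order.
instance Binary ExifInfo where
  put (ExifInfo c w h) = put c >> put w >> put h
  get = ExifInfo <$> get <*> get <*> get

-- | Turn the cached value into bytes suitable for writing to disk.
serialise :: ExifInfo -> BL.ByteString
serialise = encode

-- | Recover the value; decode throws an error on malformed input.
deserialise :: BL.ByteString -> ExifInfo
deserialise = decode
```

Data.Binary also provides encodeFile and decodeFile for going straight
to and from a file.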

  -- ryan


Re: A Monad for on-demand file generation?

by Joachim Breitner-2

Hi,

thanks for your comments.

On Monday, 30.06.2008, at 16:54 -0700, Ryan Ingram wrote:
> 1) unsafeInterleaveIO seems like a big hammer to use for this problem,
> and there are a lot of gotchas involved that you may not have fully
> thought out.  But you do meet the main criteria (file being read is
> assumed to be constant for a single run of the program).

Any other gotchas? Anyway, is this really worse than the similarly lazy
readFile? Using that would not save the call to open, but it would at
least save the reading and processing, in the same situations.

> If you have the ability to store metadata about the computation along
> with the computation results, maybe that would be a better solution?

Not sure what you mean here, sorry. Can you elaborate?

> 2) I agree with Luke that this "smells" more like an applicative
> functor.  But getting to monad syntax is quite nice if you can do so.
> As an applicative functor you would have "writeFileOD :: Filename ->
> ODIO ByteString -> ODIO ()"; then writeFile can handle all the
> necessary figuring out of timestamps itself, and you get the bonus
> guarantee that the contents of the files read by the "ODIO ByteString"
> argument won't affect the filename you are going to output to.

I thought about this (without having the applicative abstraction in
mind). This would then look like:

main = do
  f1 <- readFileOD "infile1"
  f2 <- readFileOD "infile2"
  writeFileOD "outfile1" $ someFunc <$> f1 <*> f2
  writeFileOD "outfile2" $ someOtherFunc <$> f1

right? Will it still work so that if both outfiles need to be generated,
f1 is read only once?

> 3) Instead of (Read,Show), look into Data.Binary instead, if you
> actually care about efficiency.  Parsing text at read time will almost
> never be faster than just performing the computation on the source
> data again.

I assume it’s still faster than, e.g., running an external program to
read the exif tags, but you are right, Data.Binary is nicer for this.


Thanks,
Joachim

Re: A Monad for on-demand file generation?

by Ketil Malde-5

Joachim Breitner <mail@...> writes:

>> 1) unsafeInterleaveIO seems like a big hammer to use for this problem,
>> and there are a lot of gotchas involved that you may not have fully
>> thought out.  But you do meet the main criteria (file being read is
>> assumed to be constant for a single run of the program).

> Any other gotcha?

The one that springs to mind is that you might run out of file
handles. At least on Linux, that's a precious resource.

-k
--
If I haven't seen further, it is by standing in the footprints of giants

Re: A Monad for on-demand file generation?

by Joachim Breitner-2

Hi,

On Tuesday, 01.07.2008, at 11:53 +0200, Ketil Malde wrote:

> Joachim Breitner <mail@...> writes:
>
> >> 1) unsafeInterleaveIO seems like a big hammer to use for this problem,
> >> and there are a lot of gotchas involved that you may not have fully
> >> thought out.  But you do meet the main criteria (file being read is
> >> assumed to be constant for a single run of the program).
>
> > Any other gotcha?
>
> The one that springs to mind is that you might run out of file
> handles. At least on Linux, that's a precious resource.

But at least in that respect, (unsafeInterleaveIO readFile) is actually
better than (readFile): if I consume the files in sequence and
completely, they will be opened and closed one after another with the
former, but opened all at once with the latter. At least it won’t be
worse, because a file is not closed any later, and is possibly opened
later.

Greetings,
Joachim



Re: A Monad for on-demand file generation?

by David Roundy-2

On Tue, Jul 01, 2008 at 10:22:35AM +0000, Joachim Breitner wrote:

> Hi,
>
> On Tuesday, 01.07.2008, at 11:53 +0200, Ketil Malde wrote:
> > Joachim Breitner <mail@...> writes:
> >
> > >> 1) unsafeInterleaveIO seems like a big hammer to use for this problem,
> > >> and there are a lot of gotchas involved that you may not have fully
> > >> thought out.  But you do meet the main criteria (file being read is
> > >> assumed to be constant for a single run of the program).
> >
> > > Any other gotcha?
> >
> > The one that springs to mind is that you might run out of file
> > handles. At least on Linux, that's a precious resource.
>
> But at least in that respect, (unsafeInterleaveIO readFile) is actually
> better than (readFile): if I consume the files in sequence and
> completely, they will be opened and closed one after another with the
> former, but opened all at once with the latter. At least it won’t be
> worse, because a file is not closed any later, and is possibly opened
> later.

Indeed, the best option (in my opinion) would be

unsafeInterleaveIO readFileStrict

(where you might need to write readFileStrict).  In darcs, we use lazy IO a
lot, but never lazily read a file, precisely due to the open file handle
issue.  This works pretty well, and your scenario is precisely the one in
which unsafeInterleaveIO shines.
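
One way to write such a readFileStrict is sketched below. This is an
illustration under stated assumptions, not the darcs code:
readFileOnDemand is a hypothetical name for the composed function, and
the whole file is forced with evaluate before the handle is closed, so
no handle stays open once a deferred read has actually run.

```haskell
import Control.Exception (evaluate)
import System.IO (IOMode (ReadMode), hGetContents, withFile)
import System.IO.Unsafe (unsafeInterleaveIO)

-- | Read a whole file and close its handle before returning.
readFileStrict :: FilePath -> IO String
readFileStrict path =
  withFile path ReadMode $ \h -> do
    s <- hGetContents h
    -- Force the entire contents while the handle is still open.
    _ <- evaluate (length s)
    pure s

-- | Hypothetical wrapper: the file is not even opened until the
-- returned string is demanded; once demanded, it is read in one go.
readFileOnDemand :: FilePath -> IO String
readFileOnDemand = unsafeInterleaveIO . readFileStrict
```

A consequence of the deferral is that errors such as a missing file only
surface when the contents are first used, not at the call site.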

David



Re: A Monad for on-demand file generation?

by Henning Thielemann


On Tue, 1 Jul 2008, David Roundy wrote:

> Indeed, the best option (in my opinion) would be
>
> unsafeInterleaveIO readFileStrict


How about ByteString.readFile? This is strict and efficient.


Re: A Monad for on-demand file generation?

by Ryan Ingram

On 7/1/08, Joachim Breitner <mail@...> wrote:

> Hi,
>
> thanks for your comments.
>
> On Monday, 30.06.2008, at 16:54 -0700, Ryan Ingram wrote:
> > 1) unsafeInterleaveIO seems like a big hammer to use for this problem,
> > and there are a lot of gotchas involved that you may not have fully
> > thought out.  But you do meet the main criteria (file being read is
> > assumed to be constant for a single run of the program).
>
> Any other gotchas? Anyway, is this really worse than the similarly lazy
> readFile? Using that would not save the call to open, but it would at
> least save the reading and processing, in the same situations.

Well, you're also (from your description) probably writing some
tracking information to an IORef of some sort.  That can happen in the
middle of an otherwise pure computation, and it's difficult to know
exactly when it'll get triggered, due to laziness.  You can probably
make it work :)

> > If you have the ability to store metadata about the computation along
> > with the computation results, maybe that would be a better solution?
>
> Not sure what you mean here, sorry. Can you elaborate?

Well, while doing the computation the first time, you can track what
depends on what.  Then you save *that* information out.  Here's an
example:

main = runODIO $ do
    do
        bar <- readFileOD "bar.txt"
        baz <- readFileOD "baz.txt"
        let result = expensiveComputation bar baz
        writeFileOD "foo.bin" result

    do
        hat <- readFileOD "hat.txt"
        let result = otherComputation hat
        writeFileOD "foo2.bin" result

Now, as you mentioned before, you know that the RHS of >> doesn't
depend on the files read on the LHS.  So the two "do" blocks here are
independent.  Now, if you run with no information, you run the whole
computation, and you write out in your metadata "First we are going to
build foo.bin from bar.txt and baz.txt, and then we build foo2.bin
from hat.txt".  Now when you get to the first "do" block, you know
what computation is about to happen (since you've recorded it before),
and can check the timestamps of foo.bin, bar.txt, and baz.txt, and
potentially skip the whole thing.

Of course now the metadata depends on the script itself, but you
already had to deal with that problem :)
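
The skip decision described above can be sketched as a pure function.
Everything here is an assumption made for illustration: Metadata is a
hypothetical representation of what was recorded on a previous run, and
timestamps are supplied through a lookup function so that the logic
stays independent of the file system. In a real program that lookup
would presumably be backed by something like
System.Directory.getModificationTime.

```haskell
-- | Recorded on an earlier run: each output together with the inputs
-- it was built from (hypothetical representation).
type Metadata = [(FilePath, [FilePath])]

-- | Decide whether an output has to be rebuilt.
--
-- The second argument looks up a file's timestamp; Nothing means the
-- file does not exist.
needsRebuild :: Ord t => Metadata -> (FilePath -> Maybe t) -> FilePath -> Bool
needsRebuild recorded stamp output =
  case (lookup output recorded, stamp output) of
    -- Known output that exists: rebuild only if some input changed.
    (Just inputs, Just outTime) -> any (changedSince outTime) inputs
    -- No recorded dependencies, or no output file yet: always build.
    _ -> True
  where
    -- A missing input counts as changed, so the real computation runs
    -- and can report the problem itself.
    changedSince outTime input = maybe True (> outTime) (stamp input)
```

With the example above, foo.bin would be skipped as long as it is newer
than both bar.txt and baz.txt, independently of what happens to hat.txt.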

> > 2) I agree with Luke that this "smells" more like an applicative
> > functor.  But getting to monad syntax is quite nice if you can do so.
> > As an applicative functor you would have "writeFileOD :: Filename ->
> > ODIO ByteString -> ODIO ()"; then writeFile can handle all the
> > necessary figuring out of timestamps itself, and you get the bonus
> > guarantee that the contents of the files read by the "ODIO ByteString"
> > argument won't affect the filename you are going to output to.
>
> I thought about this (without having the applicative abstraction in
> mind). This would then look like:
>
> main = do
>   f1 <- readFileOD "infile1"
>   f2 <- readFileOD "infile2"
>   writeFileOD "outfile1" $ someFunc <$> f1 <*> f2
>   writeFileOD "outfile2" $ someOtherFunc <$> f1
>
> right?

Not exactly.  Try this:

   writeFileOD "outfile1" (someFunc <$> readFileOD "infile1"
                                    <*> readFileOD "infile2")
   writeFileOD "outfile2" (someOtherFunc <$> readFileOD "infile1")

(or, equivalently, replace the "<-" with "let .. in" in your data).

> Will it still work so that if both outfiles need to be generated,
> f1 is read only once?

That depends how you write it!  Remember that you can write your
applicative functor to just build up a graph of what computation might
need to be done.  You can then analyze that graph and look for sharing
if necessary.

If you want the sharing to be explicit, you need something a bit more
monad-ish.  If the type of "readFileOD" is "Filename -> ODIO (ODIO
ByteString)" then your original syntax works and gives you a chance to
pick up on the explicit sharing by labelling the result of "f1 <-
...".
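
A minimal sketch of the "build up a description first" idea, under
loudly stated assumptions: the OD and Job types and their field names
are hypothetical, String is used instead of ByteString, and
writeFileOD returns a plain Job value rather than living in ODIO. The
point is only that an applicative value can expose its input files
without running any IO.

```haskell
import Data.List (nub)

-- | A computation whose input files are known before anything is run.
data OD a = OD
  { odInputs :: [FilePath] -- ^ statically known dependencies
  , odAction :: IO a       -- ^ what to run if the result is needed
  }

instance Functor OD where
  fmap f (OD ins act) = OD ins (fmap f act)

instance Applicative OD where
  pure x = OD [] (pure x)
  -- Dependencies of both sides are simply collected.
  OD ins1 f <*> OD ins2 x = OD (ins1 ++ ins2) (f <*> x)

-- | Describe reading a file; nothing is opened yet.
readFileOD :: FilePath -> OD String
readFileOD path = OD [path] (readFile path)

-- | A planned write: the output, its inputs, and how to produce it.
data Job = Job
  { jobOutput :: FilePath
  , jobInputs :: [FilePath]
  , jobRun    :: IO ()
  }

-- | Plan a write. nub only deduplicates the dependency list; it does
-- not share the actual reads inside the action.
writeFileOD :: FilePath -> OD String -> Job
writeFileOD out (OD ins act) = Job out (nub ins) (act >>= writeFile out)
```

A driver could then compare the timestamps of jobInputs against
jobOutput and execute jobRun only when needed. Because the output file
name is a plain argument, it cannot depend on the contents of the
inputs.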

  -- ryan


Re: A Monad for on-demand file generation?

by Joachim Breitner-2

Hi,

thanks again for your input. Just one small remark:

On Tuesday, 01.07.2008, at 14:52 -0700, Ryan Ingram wrote:

> On 7/1/08, Joachim Breitner <mail@...> wrote:
> > On Monday, 30.06.2008, at 16:54 -0700, Ryan Ingram wrote:
> > > 1) unsafeInterleaveIO seems like a big hammer to use for this problem,
> > > and there are a lot of gotchas involved that you may not have fully
> > > thought out.  But you do meet the main criteria (file being read is
> > > assumed to be constant for a single run of the program).
> >
> > Any other gotchas? Anyway, is this really worse than the similarly lazy
> > readFile? Using that would not save the call to open, but it would at
> > least save the reading and processing, in the same situations.
>
> Well, you're also (from your description) probably writing some
> tracking information to an IORef of some sort.  That can happen in the
> middle of an otherwise pure computation, and it's difficult to know
> exactly when it'll get triggered, due to laziness.  You can probably
> make it work :)

Well, for the tracking information, I can do it purely, by copying code
from StateT (or WriterT or ReaderT, I’m not sure :-)) and adapting it
slightly (e.g. the (>>) optimization). So besides unsafeInterleaveIO, no
“bad, impure stuff” should be necessary.
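
A rough sketch of what such an adapted state monad might look like,
assuming a StateT-like layout over IO. Tracked, noteRead and
currentDeps are hypothetical names, and this is not the eventual ODIO
code. The only unusual part is that (>>) deliberately differs from
"m >>= \_ -> n": the second action starts from the dependency list as
it was before the first one ran.

```haskell
import Control.Monad (ap, liftM)

-- | State-passing over IO; the state is the list of files read so far.
newtype Tracked a = Tracked
  { runTracked :: [FilePath] -> IO (a, [FilePath]) }

instance Functor Tracked where
  fmap = liftM

instance Applicative Tracked where
  pure x = Tracked $ \deps -> pure (x, deps)
  (<*>) = ap
  -- The result of m is discarded, so the files m read cannot influence
  -- n: run n with the dependency list from before m.
  m *> n = Tracked $ \deps -> do
    _ <- runTracked m deps
    runTracked n deps

instance Monad Tracked where
  -- The result of m is passed on, so its dependencies are kept.
  m >>= f = Tracked $ \deps -> do
    (x, deps') <- runTracked m deps
    runTracked (f x) deps'
  (>>) = (*>)

-- | Record that a file was read.
noteRead :: FilePath -> Tracked ()
noteRead path = Tracked $ \deps -> pure ((), path : deps)

-- | Inspect the dependencies visible at this point.
currentDeps :: Tracked [FilePath]
currentDeps = Tracked $ \deps -> pure (deps, deps)
```

With these definitions, running "noteRead "in1" >> currentDeps" from an
empty list yields no dependencies, while
"noteRead "in1" >>= \_ -> currentDeps" yields ["in1"].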

I think I’ll put my ideas to code soon and post it here.

Greetings,
Joachim
--
Joachim Breitner
  e-Mail: mail@...
  Homepage: http://www.joachim-breitner.de
  ICQ#: 74513189
  Jabber-ID: nomeata@...



Re: A Monad for on-demand file generation?

by Brandon S. Allbery KF8NH


On 2008 Jul 1, at 17:52, Ryan Ingram wrote:

> Well, you're also (from your description) probably writing some
> tracking information to an IORef of some sort.  That can happen in the
> middle of an otherwise pure computation, and it's difficult to know
> exactly when it'll get triggered, due to laziness.  You can probably
> make it work :)
>
>>> If you have the ability to store metadata about the computation along
>>> with the computation results, maybe that would be a better solution?
>>
>> Not sure what you mean here, sorry. Can you elaborate?
>
> Well, while doing the computation the first time, you can track what
> depends on what.  Then you save *that* information out.  Here's an

This sounds suspiciously like Writer to me.

--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@...
system administrator [openafs,heimdal,too many hats] allbery@...
electrical and computer engineering, carnegie mellon university    KF8NH



Re: A Monad for on-demand file generation?

by ChrisK-5

> Then, the readFileOD could put the timestamp
> of the read file in a Monad-local state and the writeFileOD could, if
> the output is newer than all inputs listed in the state, skip the
> writing and thus the unsafeInterleaveIO’ed file reads are skipped as
> well, if they were not required for deciding the flow of the program.

How is your system similar to or different from make/Makefile?

Are your actions more restricted?  Are the semantics more imperative?  Are the
dependencies still explicit, or are they implicit and inferred?

--
Chris


Re: Re: A Monad for on-demand file generation?

by Joachim Breitner-2

Hi,

On Wednesday, 02.07.2008, at 16:43 +0100, ChrisK wrote:

> > Then, the readFileOD could put the timestamp
> > of the read file in a Monad-local state and the writeFileOD could, if
> > the output is newer than all inputs listed in the state, skip the
> > writing and thus the unsafeInterleaveIO’ed file reads are skipped as
> > well, if they were not required for deciding the flow of the program.
>
> How is your system similar to or different from make/Makefile?
>
> Are your actions more restricted?  Are the semantics more imperative?  Are the
> dependencies still explicit, or are they implicit and inferred?

I think the biggest difference is that with Make, you have to explicitly
list all dependencies, which is what I want to avoid by having the monad
keep a record of the files used. So it’s mostly a convenience thingy,
although a monad would generally be more flexible, e.g. in deciding the
output file name based on the content of some of the input files.

I have some code that I’ll put somewhere soon.

Greetings,
Joachim


Re: Re: A Monad for on-demand file generation?

by Joachim Breitner-2

Hi,

On Thursday, 03.07.2008, at 15:55 +0200, Joachim Breitner wrote:
> I have some code that I’ll put somewhere soon.

http://darcs.nomeata.de/odio/ODIO.hs now contains a simple
implementation of the idea, together with more explanation. To show what
the effect is, I wrote a very small program:

1> main = runODIO $ do
2>   c1 <- readFileOD' "inFile1"
3>   c2 <- readFileOD' "inFile2"
4>   c3 <- readFileOD' "inFile3"
5>   liftIO $ putStrLn "Some output"
6>   writeFileOD' "outFile1" (show (length c1 + length c2))
7>   c4 <- readFileOD' "inFile4"
8>   writeFileOD' "outFile2" (show (length c1 + length c3 + length c4))
9>   time <- liftIO $ getClockTime
A>   writeFileOD' "outFile3" (show time ++ c1)

and a script that runs this under various conditions
http://darcs.nomeata.de/odio/demo.sh with the output available at
http://darcs.nomeata.de/odio/demo.out. Note that the primes after the
function calls are just for the verbose variant for demonstration.

Some points to emphasize (you can verify them in the demo output).

 * The 9th line runs an arbitrary IO action, so from then on, ODIO can’t
do anything but actually write out every file it is asked to.

 * The 5th line does not have this effect. Because it gets desugared
to (>>), the special implementation of (>>) means that the next line
still sees the same dependency state as before the call to liftIO.

 * A change to inFile3 causes outFile1 to be re-written. From
looking at the code, _we_ know that this is not necessary, but the ODIO
monad cannot tell. The programmer should have swapped the lines.

 * A change only to inFile4 means that outFile1 will not have to be
regenerated, and thanks to laziness and unsafeInterleaveIO, inFile2 will
not even be opened.


Some additions that might be necessary for real world use:
 * ByteString interface
 * a variant of readFileOD with type “FilePath -> IO a -> ODIO a”
if, instead of reading the file directly, you want to call some external
parsing helper (e.g. to read exif data).
 * An even more verbose mode that tells you exactly why a write action
has to be done. This is why I keep a list of files and timestamps
around.


I hope this is a basis for even more discussion, and of course
http://darcs.nomeata.de/odio/ is a darcs repository, so feel free to
send patches.

Greetings,
Joachim

--
Joachim "nomeata" Breitner
  mail: mail@... | ICQ# 74513189 | GPG-Key: 4743206C
  JID: nomeata@... | http://www.joachim-breitner.de/
  Debian Developer: