try-except in generator methods


by Daniel Grunwald

Related to BOO-24:
We must allow using "yield" inside the try part of try-except blocks.
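C# allows this:
IEnumerable<string> ReadFile(string filename) {
   // "using" compiles to try-finally, so the reader is disposed
   // when the enumeration finishes or the enumerator is disposed.
   using (StreamReader r = new StreamReader(filename)) {
      string line;
      while ((line = r.ReadLine()) != null)
         yield return line;
   }
}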

When the file is completely enumerated, the StreamReader will be closed,
and when a foreach loop is aborted (with "break" or exception), it will
call Dispose() on the generated enumerator which also has the effect of
closing the StreamReader.
In Boo, it is currently not possible to get the same behavior without
writing the enumerator manually. And even if one does that, it doesn't
work correctly unless "for" calls Dispose(). And because "for" needs to
use a try-ensure block for calling Dispose(), and there's lots of code
with "yield" inside "for", it is important to support "yield" inside "try".
Otherwise using LINQ becomes a huge PITA as soon as you're using any
LINQ implementation that needs to dispose something (files, SQL
connections, ...) - basically anything except the plain LINQ-to-Objects.
We also need to ensure that all Boo.Lang code that manually consumes
enumerators (e.g. the builtin functions like zip) calls Dispose(). When
possible, these should be rewritten to use foreach and yield return so
that the C# compiler takes care of forwarding the Dispose() calls.

Note that the C# 3.0 compiler implements this as follows (I think C# 2.0
did something different):
The body of MoveNext is wrapped inside a huge try-fault block:
try { /* usual method body */ } fault { this.Dispose(); }

try-fault is pseudo-C# for the IL fault handler, which works just like a
finally handler except it is only called when the block is left due to
an exception, not when it is left by a normal return statement.
(Boo supports this directly: it's try-failure.)

Remember the cases in which the "ensure" code should run:
1) when control flow in the generator normally leaves the "try" block
(but not when control flow leaves the "try" block due to "yield" being
implemented as "return Yield(...)")
2) when the generator code causes an exception in the "try" block
3) when the enumeration is aborted (due to "break" or exception in the
for loop) - in this case, Dispose() is called on the enumerator.
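
The three cases above can be sketched with Python generators, which behave analogously: "finally" stands in for Boo's "ensure", and generator.close() stands in for Dispose() on the enumerator. This is only an illustration under that analogy, not Boo or C# code; the names generate, failing and the log lists are made up for the sketch.

```python
def generate(log):
    """Generator whose 'finally' plays the role of an 'ensure' block."""
    try:
        yield 1
        yield 2
    finally:
        log.append("ensure")


def failing(log):
    """Same shape, but the generator body raises inside the try block."""
    try:
        yield 1
        raise ValueError("boom")
    finally:
        log.append("ensure")


# Case 1: the generator runs to completion and leaves the try block normally.
log1 = []
items = list(generate(log1))

# Case 2: the generator code raises an exception inside the try block.
log2 = []
gen2 = failing(log2)
next(gen2)
try:
    next(gen2)
except ValueError:
    log2.append("caught")

# Case 3: the consumer stops early and closes the generator explicitly,
# which is the analogue of calling Dispose() on the enumerator.
log3 = []
gen3 = generate(log3)
first = next(gen3)
log3.append("suspended")  # the yield alone did not run the ensure code
gen3.close()
```

In all three cases the cleanup runs exactly once, and in case 3 it runs only when close() is called, not when the generator is suspended at a yield.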

Because a yield statement is invalid in try-except (C# allows it only in
the try block of try-finally, not in the finally handler itself, and not
in try-catch blocks), an exception (no matter whether inside the
enumerator or inside the consumer loop) will always cause the
enumeration to finish, calling all outstanding finally blocks. The
try-fault block makes cases 2 and 3 behave identically: the
Dispose() method handles executing the finally blocks surrounding
the current yield. Note that this means that we have to update the state
index when entering and leaving try blocks, not only on the "yield" call
itself. The C# compiler handles this by having multiple kinds of states,
"running" and "suspended" states. There's a global "running" state
(which is also used as finish state), and a "running" state per
try-ensure statement, so that the Dispose() method always knows which
ensure blocks must be executed in case of an exception that causes
Dispose() to be called through the fault handler.
The "suspended" states are set immediately before returning from
MoveNext so that the next call knows where to continue; Dispose() also
knows which finally blocks must be run for each "suspended" state.
This handles cases 2) and 3) correctly. Case 1) is much simpler, at the
end of try blocks the state is set to the running state of the next
outer try block and the finally code is run.
All try-finally constructs containing "yield" are now replaced by
normal code (state setting and running the finally code normally), plus
a single try-fault handler. Because that fault handler wraps the
whole method, the state machine doesn't have to jump into try blocks
anymore (this was mentioned as a problem in BOO-24).
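
A hand-written version of such a state machine, for a generator with two yields inside one try-ensure block, might look roughly like the Python sketch below. It is an illustration of the scheme described above, not the code either compiler emits: the state constants, the class name GenerateEnumerator and the fail_after_first switch are all invented for the example, and since Python has no fault handler, an except-and-reraise clause stands in for it.

```python
# State constants (hypothetical names for this sketch).
RUNNING = -1      # global "running" state, also used as the finished state
START = 0         # suspended before the first move_next call, outside the try
IN_TRY = 1        # "running" state of the single try-ensure block
SUSPENDED_1 = 2   # suspended at "yield 1", inside the try block
SUSPENDED_2 = 3   # suspended at "yield 2", inside the try block


class GenerateEnumerator:
    def __init__(self, log, fail_after_first=False):
        self.state = START
        self.current = None
        self.log = log
        self.fail_after_first = fail_after_first

    def _ensure(self):
        self.log.append("ensure")

    def move_next(self):
        try:
            state = self.state
            if state == START:
                self.state = IN_TRY          # entering the try block
                self.current = 1
                self.state = SUSPENDED_1     # set just before returning
                return True
            if state == SUSPENDED_1:
                self.state = IN_TRY          # resuming inside the try block
                if self.fail_after_first:
                    raise RuntimeError("generator body failed")
                self.current = 2
                self.state = SUSPENDED_2
                return True
            if state == SUSPENDED_2:
                self.state = RUNNING         # case 1: leaving the try normally
                self._ensure()
                return False
            return False                     # already finished
        except BaseException:
            # Emulated fault handler: runs only when an exception escapes.
            self.dispose()                   # case 2
            raise

    def dispose(self):
        state = self.state
        self.state = RUNNING                 # mark the enumerator as finished
        if state in (IN_TRY, SUSPENDED_1, SUSPENDED_2):
            self._ensure()                   # cases 2 and 3


def drain(enumerator):
    """Consume the enumerator the way a 'for' loop would."""
    items = []
    while enumerator.move_next():
        items.append(enumerator.current)
    return items


log_full = []
full = GenerateEnumerator(log_full)
items_full = drain(full)
full.dispose()             # no-op: the ensure code already ran

log_abort = []
aborted = GenerateEnumerator(log_abort)
aborted.move_next()
aborted.dispose()          # case 3: runs the ensure code
```

Because dispose() resets the state before running the ensure code, a second call (or a call after normal completion) does nothing, so the cleanup cannot run twice.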

Hmm.. this sounds interesting, I think I'll try to implement it myself ;-)

Daniel




Re: try-except in generator methods

by Rodrigo B. de Oliveira


On Jan 22, 2008 5:15 PM, Daniel Grunwald <daniel@...> wrote:
> Related to BOO-24:
> We must allow using "yield" inside the try part of try-except blocks.

Yes!

> ...
> Remember the cases in which the "ensure" code should run:
> ...

Very good explanation.

>
> Hmm.. this sounds interesting, I think I'll try to implement it myself ;-)
>

YEAH

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Boo Programming Language" group.
To post to this group, send email to boolang@...
To unsubscribe from this group, send email to boolang-unsubscribe@...
For more options, visit this group at http://groups.google.com/group/boolang
-~----------~----~----~----~------~----~------~--~---


Re: try-except in generator methods

by Avishay Lavie


How do you know if the "enumeration was aborted"? The calling code
gets back an IEnumerator and is free to call or not call MoveNext on
it any number of times. Suppose I do this:

def Generate():
  try:
    yield 42
    yield 4242
    yield 424242
  ensure:
    Dispose()

def Consume():
  for index, value in zip(range(2), Generate()):
    print "${index} => ${value}"

This will consume only the first two yields (or even just the
first one, I'm never sure about range()...). When, if at all, will the
internal Dispose() code be called?

On Jan 22, 10:15 pm, Daniel Grunwald <dan...@...> wrote:

> Remember the cases in which the "ensure" code should run:
> 1) when control flow in the generator normally leaves the "try" block
> (but not when control flow leaves the "try" block due to "yield" being
> implemented as "return Yield(...)")
> 2) when the generator code causes an exception in the "try" block
> 3) when the enumeration is aborted (due to "break" or exception in the
> for loop) - in this case, Dispose() is called on the enumerator.



Re: try-except in generator methods

by Daniel Grunwald


Avish wrote:
> How do you know if the "enumeration was aborted"? The calling code
> gets back an IEnumerator and is free to call or not call MoveNext on
> it any number of times.
"for" should call enumerator.Dispose() just like "foreach" does in C#.

> Suppose I do this:
>
> def Generate():
>   try:
>     yield 42
>     yield 4242
>     yield 424242
>   ensure:
>     Dispose()
>
> def Consume():
>   for index, value in zip(range(2), Generate()):
>     print "${index} => ${value}"
>
> This will consume only the first two yields (or even just the
> first one, I'm never sure about range()...). When, if at all, will the
> internal Dispose() code be called?
Compiling your code with my locally patched Boo version and then using
Reflector on it produces this:

public static void Consume()
{
    object[] enumerables = new object[] { Builtins.range(2), Generate() };
    IEnumerator ___iterator12 = Builtins.zip(enumerables);
    try
    {
        while (___iterator12.MoveNext())
        {
            object[] ___temp13 = (object[]) ___iterator12.Current;
            object index = ___temp13[0];
            object value = ___temp13[1];
            Builtins.print(new StringBuilder().Append(index).Append(" => ").Append(value).ToString());
        }
    }
    finally
    {
        IDisposable ___disposable14 = ___iterator12 as IDisposable;
        if (___disposable14 != null)
        {
            ___disposable14.Dispose();
            // calls ZipEnumerator.Dispose, which in turn calls
            // Generate.$.Dispose, which then
            // a) does nothing if the iterator has been consumed
            //    completely and thus already ran the ensure block
            // b) runs your ensure block (if the enumeration aborted
            //    early), calling whatever Dispose() method you are referring to
        }
    }
}

I already have code generation for the Generator.$.MoveNext() and
Generator.$.Dispose() methods nearly working.
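
The forwarding of Dispose() through zip can be illustrated with a Python analogue, where close() plays the role of Dispose(). This is a sketch under that analogy, not the Boo.Lang ZipEnumerator; zip_disposing and the log list are hypothetical names, and contextlib.closing stands in for the try/finally that "for" would generate.

```python
from contextlib import closing


def zip_disposing(*iterables):
    """zip-like generator that closes its inner iterators when it ends
    or when it is closed early by the consumer."""
    iterators = [iter(it) for it in iterables]
    try:
        while True:
            row = []
            for it in iterators:
                try:
                    row.append(next(it))
                except StopIteration:
                    return  # shortest input exhausted; 'finally' still runs
            yield tuple(row)
    finally:
        for it in iterators:
            close = getattr(it, "close", None)
            if close is not None:
                close()  # forwards the "Dispose" to inner generators


def generate(log):
    try:
        yield 42
        yield 4242
        yield 424242
    finally:
        log.append("ensure")


# range(2) ends first, so generate() is abandoned after two items.
log_short = []
with closing(zip_disposing(range(2), generate(log_short))) as pairs:
    rows = list(pairs)

# The consumer breaks out of the loop after the first pair.
log_break = []
with closing(zip_disposing(range(5), generate(log_break))) as pairs:
    for index, value in pairs:
        break
```

In both runs the inner generator never reaches its end on its own, yet its cleanup still runs once because the wrapper forwards the close call.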



Re: try-except in generator methods

by Avishay Lavie


Ok, so zip() and for() and everything else that uses IEnumerables
should call dispose on them, that makes sense, but it's still valid
for the "calling code" to only call MoveNext twice without disposing.
So basically, we're talking about a "best effort" kind of support,
where "as long as you use the language's builtin features, disposables
are disposed correctly"?

Also, that reflected code made me realize we really need to typify our
tuples and perhaps use string.Format() instead of newing up a
StringBuilder on each string interpolation :)



Re: try-except in generator methods

by Daniel Grunwald

Avish wrote:
> Ok, so zip() and for() and everything else that uses IEnumerables
> should call dispose on them, that makes sense, but it's still valid
> for the "calling code" to only call MoveNext twice without disposing.
> So basically, we're talking about a "best effort" kind of support,
> where "as long as you use the language's builtin features, disposables
> are disposed correctly"?
>  
Yes. The "ensure" will be 'forgotten' if the enumeration is aborted
early without calling dispose.
If you don't dispose an enumerator, you can get problems just like if
you don't dispose any other disposable.

By the way, I just fixed TextReaderEnumerator.lines to dispose the
underlying TextReader.
Previously code such as:

    import System.IO
    for nr, line in enumerate(File.OpenText('test.txt')):
      print "${nr}: ${line}"
    File.Delete('test.txt')

would fail because the file was still in use until the garbage collector
finalized the StreamReader.
But there remains a problem with using TextReader like
IEnumerable<string>: a text reader can be enumerated only once.
Previously,

    import System.IO
    file = File.OpenText('test.txt')
    for nr, line in enumerate(file):
      print "${nr}: ${line}"
    for nr, line in enumerate(file):
      print "${nr}: ${line}"

would print the file only once, because the TextReader wasn't reset for
the second enumeration.
Now the code will fail with an ObjectDisposedException because the file
is closed at the end of the first loop.
I don't like this implicit TextReader->IEnumerable<string> conversion;
for easy file reading I would prefer a method like this:

    import System.IO

    def readFile(filename as string):
      # "using" disposes the reader when the generator finishes or is disposed
      using reader = File.OpenText(filename):
        while (line = reader.ReadLine()) is not null:
          yield line

That would cause the file to be re-opened whenever GetEnumerator() is
called on the returned generator.
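
The difference between a one-shot reader and a re-openable generator can be sketched in Python as follows. The class name ReadFile, the opener callback and the open_count counter are made up for the illustration, and io.StringIO stands in for a real file so that the example is self-contained.

```python
import io


class ReadFile:
    """Iterable that re-opens its source on every enumeration,
    loosely analogous to GetEnumerator() re-running the generator body."""

    def __init__(self, opener):
        self._opener = opener   # hypothetical callback returning a fresh reader
        self.open_count = 0

    def __iter__(self):
        self.open_count += 1
        reader = self._opener()
        try:
            for line in reader:
                yield line.rstrip("\n")
        finally:
            reader.close()      # runs on completion or on early close


lines = ReadFile(lambda: io.StringIO("one\ntwo\n"))
first_pass = list(lines)
second_pass = list(lines)       # works again: the source is re-opened

# Contrast: a single shared reader can only be enumerated once.
shared = io.StringIO("one\ntwo\n")
shared_first = [line.rstrip("\n") for line in shared]
shared_second = [line.rstrip("\n") for line in shared]
```

The shared reader yields nothing on the second pass, which mirrors the TextReader problem described above, while the re-openable wrapper produces the same lines each time.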

> Also, that reflected code made me realize we really need to typify our
> tuples and perhaps use string.Format() instead of newing up a
> StringBuilder on each string interpolation :)
>  
Typify tuples: yes!

string.Format: no, string.Format parses the format string and then calls
StringBuilder - so the current implementation is more efficient.
Since the number of items is known at compile time, even better would be
to simply use a single string.Concat call.

Daniel




Re: try-except in generator methods

by Cedric Vivier


On Jan 26, 2008 4:38 PM, Avish <some.avish@...> wrote:
> Also, that reflected code made me realize we really need to typify our
> tuples

+1

> and perhaps use string.Format() instead of newing up a
> StringBuilder on each string interpolation :)

No, SB-based interpolation is about 20% faster than string.Format on both Mono and MS.NET, with about the same memory usage (at most about 5% more for SB).
However, it would be nice to further optimize interpolations inside loops by instantiating the StringBuilder outside the loop and resetting it on each iteration (that would gain roughly another 8% in performance and bring memory usage level with string.Format) :p

 



Re: try-except in generator methods

by Cedric Vivier

On Jan 26, 2008 5:58 PM, Daniel Grunwald <daniel@...> wrote:
> string.Format: no, string.Format parses the format string and then calls
> StringBuilder - so the current implementation is more efficient.
> Since the number of items is known at compile time, even better would be
> to simply use a single string.Concat call.

Yes! We should do that when the number of items is known at compile time :)
Let's open a JIRA bug for this.

 



Re: try-except in generator methods

by Cedric Vivier

On Jan 26, 2008 6:03 PM, Cedric Vivier <cedricv@...> wrote:
> On Jan 26, 2008 5:58 PM, Daniel Grunwald <daniel@...> wrote:
> > string.Format: no, string.Format parses the format string and then calls
> > StringBuilder - so the current implementation is more efficient.
> > Since the number of items is known at compile time, even better would be
> > to simply use a single string.Concat call.
>
> Yes! We should do that when the number of items is known at compile time :)
> Let's open a JIRA bug for this.

Hmm.. of course, we actually always know the number of items for the current SB-based string interpolations.

