iterator question

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

iterator question

Mark Ng

I'm confused as to why the 2nd call to uniq works as expected, but not the first.
If the type of uniq is changed List[T] then it all works as expected.

// return elements that are in a, but not in b
def uniq[T](a:Iterator[T], b:Iterator[T]) =
    a.filter(!b.contains(_))
   
uniq(Iterator.range(1,6), Iterator.range(4,9)).foreach(println)

output => 12345

uniq(Iterator.range(4,9), Iterator.range(1,6)).foreach(println)

output => 678

Scala code runner version 2.7.1.final -- Copyright 2002-2008, LAMP/EPFL

thanks,
mark

Reply | Threaded
Open this post in threaded view
|

Re: iterator question

Aaron Harnly-2-2
Hi Mark,

On Jun 26, 2008, at 3:32 AM, Mark Ng wrote:

> I'm confused as to why the 2nd call to uniq works as expected, but  
> not the first.
> // return elements that are in a, but not in b
> def uniq[T](a:Iterator[T], b:Iterator[T]) = a.filter(!b.contains(_))
>
> uniq(Iterator.range(1,6), Iterator.range(4,9)).foreach(println)
> output => 12345
>
> uniq(Iterator.range(4,9), Iterator.range(1,6)).foreach(println)
> output => 678

The second produces the output you expected, but not by working in the  
way that you seem to think.
Remember that iterators are stateful. Once you've exhausted an  
iterator, it's done -- no more elements.

So when you run the repeated 'contains' tests, after iterator 'b' has  
been exhausted, the 'contains' tests will always yield false, and the  
filter predicate will yield 'true', so every number will be printed.  
This happens to produce the right answer in the second case, but not  
in the first.

Whereas when you pass in a List, the 'filter' and 'contains' methods  
call the 'elements' method of the list, which yields a *fresh*  
iterator each time.

To go in too much detail, let's step through every event of your  
second call:

1. uniq: a.filter(!b.contains(_))
        1.1. Get element from a
                - yields 4
                1.1.1 Test !b.contains(4)
                        1.1.1.1 Get element from b
                                - yields 1, which != 4, so continue 'contains'
                        1.1.1.2 Get element from b
                                - yields 2, which != 4, so continue 'contains'
                        1.1.1.3 Get element from b
                                - yields 3, which != 4, so continue 'contains'
                        1.1.1.4 Get element from b
                                - yields 4, which ==  4, so 'contains' is done
                        - 'contains' yields true
                        - ! contains yields false
        1.2 Get element from a
                - yields 5
                1.2.1 Test !b.contains(5)
                        1.2.1.1 Get element from b
                                - yields 5, which == 5, so 'contains' is done.
                        - 'contains' yields true
                        - ! contains yields false
        1.3 Get element from a
                - yields 6
                1.3.1 Test !b.contains(6)
                        1.3.1.1 Get element from b
                                - yields nothing; b is exhausted
                        - 'contains' yields false
                        - ! contains yields true
                1.3.2 Call println(6)
                        - "6" in output
        1.4 Get element from a
                - yields 7
                1.3.1 Test !b.contains(7)
                        1.3.1.1 Get element from b
                                - yields nothing; b is exhausted
                        - 'contains' yields false
                        - ! contains yields true
                1.3.2 Call println(7)
                        - "7" in output
        1.5 Get element from a
                - yields 8
                1.3.1 Test !b.contains(8)
                        1.3.1.1 Get element from b
                                - yields nothing; b is exhausted
                        - 'contains' yields false
                        - ! contains yields true
                1.3.2 Call println(8)
                        - "8" in output


I think you can now see why your first call produces the output that  
it does.

Hope that helps!
~aaron




Reply | Threaded
Open this post in threaded view
|

Re: iterator question

Mark Ng

Thank you Aaron.  That is an excellent explanation.  You're right I forgot that iterators are stateful, single-used  and can't be reseted, very unlike lists.


mark


On Thu, Jun 26, 2008 at 11:18 PM, Aaron Harnly <[hidden email]> wrote:
Hi Mark,


On Jun 26, 2008, at 3:32 AM, Mark Ng wrote:
I'm confused as to why the 2nd call to uniq works as expected, but not the first.
// return elements that are in a, but not in b
def uniq[T](a:Iterator[T], b:Iterator[T]) = a.filter(!b.contains(_))

uniq(Iterator.range(1,6), Iterator.range(4,9)).foreach(println)
output => 12345

uniq(Iterator.range(4,9), Iterator.range(1,6)).foreach(println)
output => 678

The second produces the output you expected, but not by working in the way that you seem to think.
Remember that iterators are stateful. Once you've exhausted an iterator, it's done -- no more elements.

So when you run the repeated 'contains' tests, after iterator 'b' has been exhausted, the 'contains' tests will always yield false, and the filter predicate will yield 'true', so every number will be printed. This happens to produce the right answer in the second case, but not in the first.

Whereas when you pass in a List, the 'filter' and 'contains' methods call the 'elements' method of the list, which yields a *fresh* iterator each time.

To go in too much detail, let's step through every event of your second call:

1. uniq: a.filter(!b.contains(_))
       1.1. Get element from a
               - yields 4
               1.1.1 Test !b.contains(4)
                       1.1.1.1 Get element from b
                               - yields 1, which != 4, so continue 'contains'
                       1.1.1.2 Get element from b
                               - yields 2, which != 4, so continue 'contains'
                       1.1.1.3 Get element from b
                               - yields 3, which != 4, so continue 'contains'
                       1.1.1.4 Get element from b
                               - yields 4, which ==  4, so 'contains' is done
                       - 'contains' yields true
                       - ! contains yields false
       1.2 Get element from a
               - yields 5
               1.2.1 Test !b.contains(5)
                       1.2.1.1 Get element from b
                               - yields 5, which == 5, so 'contains' is done.
                       - 'contains' yields true
                       - ! contains yields false
       1.3 Get element from a
               - yields 6
               1.3.1 Test !b.contains(6)
                       1.3.1.1 Get element from b
                               - yields nothing; b is exhausted
                       - 'contains' yields false
                       - ! contains yields true
               1.3.2 Call println(6)
                       - "6" in output
       1.4 Get element from a
               - yields 7
               1.3.1 Test !b.contains(7)
                       1.3.1.1 Get element from b
                               - yields nothing; b is exhausted
                       - 'contains' yields false
                       - ! contains yields true
               1.3.2 Call println(7)
                       - "7" in output
       1.5 Get element from a
               - yields 8
               1.3.1 Test !b.contains(8)
                       1.3.1.1 Get element from b
                               - yields nothing; b is exhausted
                       - 'contains' yields false
                       - ! contains yields true
               1.3.2 Call println(8)
                       - "8" in output


I think you can now see why your first call produces the output that it does.

Hope that helps!
~aaron