Strange behaviour of map function depending on first argument. Explanation = ?

wintermute314
Dear,

when I execute

val objects = (1 to 2) map (i => new SomeObject(i))

it seems that objects is an object containing functions to create "new SomeObject(i)" because when printing out these objects twice, I get two different sets of object names.

When executing

val objects2 = List.range(1, 3) map (i => new SomeObject(i))

and then also print out objects2 twice, I get the same object names both times. So the "new SomeObject(i)" is computed only once.

Below is a small program and printout showing this behaviour. Could someone please enlighten me on the explanation?
(Both 2.7.4 and 2.7.5 show this behaviour)

Thanks!

Bart

========= Program ==========

package bart.stubTools


object Testing123  {
  def main(args: Array[String])   {
    val objects = (1 to 2) map (i => new SomeObject(i))

    println("=== Using (1 to 2)  ---> 2 different sets of objects")
    objects foreach println
    println
    objects foreach println

    println("\n=== Now with List.range ---> the same set is printed twice")
    val objects2 = List.range(1, 3) map (i => new SomeObject(i))

    objects2 foreach println
    println
    objects2 foreach println
  }

  class SomeObject(val i : Int) {
  }
}


============ Output ==========

"C:\Program Files\Java\jdk1.6.0_13\bin\java" -Didea.launcher.port=7532 "-Didea.launcher.bin.path=C:\Program Files\Jetbrains\IntelliJ IDEA 8.x\bin" -Dfile.encoding=windows-1252 -classpath "C:\Program Files\Java\jdk1.6.0_13\jre\lib\charsets.jar;C:\Program Files\Java\jdk1.6.0_13\jre\lib\deploy.jar;C:\Program Files\Java\jdk1.6.0_13\jre\lib\javaws.jar;C:\Program Files\Java\jdk1.6.0_13\jre\lib\jce.jar;C:\Program Files\Java\jdk1.6.0_13\jre\lib\jsse.jar;C:\Program Files\Java\jdk1.6.0_13\jre\lib\management-agent.jar;C:\Program Files\Java\jdk1.6.0_13\jre\lib\plugin.jar;C:\Program Files\Java\jdk1.6.0_13\jre\lib\resources.jar;C:\Program Files\Java\jdk1.6.0_13\jre\lib\rt.jar;C:\Program Files\Java\jdk1.6.0_13\jre\lib\ext\dnsns.jar;C:\Program Files\Java\jdk1.6.0_13\jre\lib\ext\localedata.jar;C:\Program Files\Java\jdk1.6.0_13\jre\lib\ext\sunjce_provider.jar;C:\Program Files\Java\jdk1.6.0_13\jre\lib\ext\sunmscapi.jar;C:\Program Files\Java\jdk1.6.0_13\jre\lib\ext\sunpkcs11.jar;C:\Users\bart\IdeaProjects\wintermute\out\production\wintermute;C:\Program Files\Scala\lib\scala-dbc.jar;C:\Program Files\Scala\lib\scala-compiler.jar;C:\Program Files\Scala\lib\scala-library.jar;C:\Program Files\Scala\lib\scala-swing.jar;C:\Program Files\Jetbrains\IntelliJ IDEA 8.x\lib\idea_rt.jar" com.intellij.rt.execution.application.AppMain bart.stubTools.Testing123 3

=== Using (1 to 2)  ---> 2 different sets of objects
bart.stubTools.Testing123$SomeObject@1cd8669
bart.stubTools.Testing123$SomeObject@337838

bart.stubTools.Testing123$SomeObject@18558d2
bart.stubTools.Testing123$SomeObject@18a47e0

=== Now with List.range ---> the same set is printed twice
bart.stubTools.Testing123$SomeObject@15eb0a9
bart.stubTools.Testing123$SomeObject@1a05308

bart.stubTools.Testing123$SomeObject@15eb0a9
bart.stubTools.Testing123$SomeObject@1a05308

Re: [scala] Strange behaviour of map function depending on first argument. Explanation = ?

Jan Lohre
Not that this hasn't come up more than once already.

(1 to 2) results in a lazy data structure (Range if I am correct), hence calling map on it results again in something lazy.
List.range(1,3) results in a strict data structure (List), hence calling map on it results again in something strict.

  try
(1 to 2).force map (i => new SomeObject(i))
 or
((1 to 2) map (i => new SomeObject(i))).force

I hope that helps.

Kind regards,
Jan

Busy evaluation (was: [scala] Strange behaviour of map function depending on first argument. Explanation = ?)

Florian Hars-3
Jan Lohre wrote:
> (1 to 2) results in a lazy data structure

No, it doesn't. A lazy data structure guarantees to evaluate its members
*at* *most* once. If you want a catchy name for the evaluation semantics
of the default non-strict constructs in scala it would be "busy", not
"lazy".

- Florian.

Re: Busy evaluation (was: [scala] Strange behaviour of map function depending on first argument. Explanation = ?)

David MacIver
2009/6/6 Florian Hars <[hidden email]>:
> Jan Lohre wrote:
>>
>> (1 to 2) results in a lazy data structure
>
> No, it doesn't. A lazy data structure guarantees to evaluate its members
> *at* *most* once. If you want a catchy name for the evaluation semantics of
> the default non-strict constructs in scala it would be "busy", not
> "lazy".

I don't know where I'd look for a formal definition of "lazy
evaluation" that would be considered authoritative, but in a lot of
usages (and in particular on Wikipedia, for whatever that's worth), "lazy"
does actually seem to get used as a catch-all term for non-strict
evaluation rather than specifically for call by need. So structures with
call-by-name evaluation, like 1 to 2, could reasonably be called lazy.
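
To make the terminology concrete, here is a minimal sketch of the three evaluation strategies using plain Scala definitions rather than collections. The object and method names are made up for illustration; each method returns how often the right-hand side had run before the first use, how often it had run after two uses, and the sum of the two uses.

```scala
object EvalStrategies {
  // Strict: a `val` evaluates its right-hand side immediately, exactly once.
  def strictVal: (Int, Int, Int) = {
    var evaluations = 0
    val x = { evaluations += 1; 42 }
    val before = evaluations
    val total = x + x
    (before, evaluations, total)
  }

  // Call by need: a `lazy val` evaluates on first use and caches the result.
  def byNeed: (Int, Int, Int) = {
    var evaluations = 0
    lazy val x = { evaluations += 1; 42 }
    val before = evaluations
    val total = x + x
    (before, evaluations, total)
  }

  // Call by name: a `def` re-evaluates its body on every use.
  def byName: (Int, Int, Int) = {
    var evaluations = 0
    def x = { evaluations += 1; 42 }
    val before = evaluations
    val total = x + x
    (before, evaluations, total)
  }

  def main(args: Array[String]): Unit = {
    println(s"val      : $strictVal")
    println(s"lazy val : $byNeed")
    println(s"def      : $byName")
  }
}
```

In this vocabulary, the behaviour Florian calls "busy" corresponds to the `def` case, and his stricter reading of "lazy" corresponds to the `lazy val` case.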

Re: Strange behaviour of map function depending on first argument. Explanation = ?

eishay
In reply to this post by wintermute314
The issue was discussed a few weeks ago in this thread:
Seq repeated execution unexpected behavior http://www.nabble.com/Seq-repeated-execution-unexpected-behavior-td23577882.html#a23577882
Though I understand the motivation and the way it works, in my opinion such implicit behavior is dangerous. I would lobby to change it :-)

eishay

[scala] Re: Busy evaluation

Ivan Todoroski-2
In reply to this post by Florian Hars-3
On 06.06.2009 18:17, Florian Hars wrote:
> Jan Lohre wrote:
>> (1 to 2) results in a lazy data structure
>
> No, it doesn't. A lazy data structure guarantees to evaluate its members
> *at* *most* once. If you want a catchy name for the evaluation semantics
> of the default non-strict constructs in scala it would be "busy", not
> "lazy".
>
> - Florian.

Scala actually offers all three variants.

- if you want upfront (or strict) evaluation, use List.range()
- if you want deferred evaluation at most once, use Stream.range()
- if you want deferred evaluation every time, use new Range()

Try it with the examples offered by the original poster.
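
For readers on a newer Scala, here is a rough sketch of the same three variants that counts how often the mapping function runs over two traversals. It rests on assumptions that are not part of this thread: it targets Scala 2.13 or later, where `LazyList` stands in for `Stream`, and where `(1 to 2) map f` is itself strict (that has been the case since the 2.8 collections), so the re-evaluating behaviour of the 2.7 Range is approximated with `.view`. `ThreeVariants` and `constructions` are illustrative names.

```scala
object ThreeVariants {
  final class SomeObject(val i: Int)

  // Builds a collection with a counting constructor function, traverses it
  // twice, and returns how many SomeObject instances were created in total.
  private def constructions(build: (Int => SomeObject) => Iterable[SomeObject]): Int = {
    var count = 0
    val xs = build { i => count += 1; new SomeObject(i) }
    xs.foreach(_ => ())
    xs.foreach(_ => ())
    count
  }

  // Upfront (strict) evaluation: all objects are created when map is called.
  def strict: Int = constructions(f => List.range(1, 3).map(f))

  // Deferred, at most once: cells are computed on demand and then cached.
  def memoized: Int = constructions(f => LazyList.range(1, 3).map(f))

  // Deferred, every time: a view re-applies the function on each traversal.
  def everyTime: Int = constructions(f => (1 to 2).view.map(f))

  // Materialising the view once (here with toList) pins the objects down.
  def forced: Int = constructions(f => (1 to 2).view.map(f).toList)

  def main(args: Array[String]): Unit = {
    println(s"List.range        : $strict")
    println(s"LazyList.range    : $memoized")
    println(s"view              : $everyTime")
    println(s"view, then toList : $forced")
  }
}
```

Only the view variant should create the objects again on the second traversal, which is the effect the original poster saw as two different sets of object names.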

I suppose an argument could be made that it would be more intuitive for
beginning users if the (1 to 2) syntax sugar was mapped to
Stream.range() or even List.range() instead of new Range(), but that
could lead to exorbitant memory use in simple for loops if the number of
iterations were large and we didn't need to collect all results.

Perhaps this should go in the Scala FAQ?

Re: [scala] Re: Busy evaluation

wintermute314
My preference would go to mapping (1 to 3) to the Stream.range because this seems the most defensive option:
- it protects against running out of memory due to huge lists
- it executes the code only once which is what would be the intention in most cases
- people who really want the "busy" behaviour will most probably make a conscious decision to want this behaviour and so will not mind making that explicit in the code.

I would also propose keeping the term "lazy" for calculations that are performed at most once and using another term for the "busy" form. Maybe "busy" is not the best term. "Repeated", "reiterated", ...?

Thanks for the swift replies!

Bart



Re: [scala] Re: Busy evaluation

Ivan Todoroski-2
On 07.06.2009 12:56, wintermute314 wrote:
> My preference would go to mapping (1 to 3) to the Stream.range because this
> seems the most defensive option:
> - it protects against running out of memory due to huge lists

That's not completely true though. To borrow the theme of your example:


var sum: BigInt = 0

val objects = Stream.range(1, 10000000) map (i => new SomeObject(i))
objects foreach {x => sum += x.i}

println(sum)


The above will throw an OutOfMemoryError, whereas the Range version will
still compute the sum.

However, the following works fine:

Stream.range(1, 10000000) map (i => new SomeObject(i)) foreach {x => sum += x.i}

Same goes for the simple for-loop idiom:

for (x <- Stream.range(1, 10000000)) sum += x

The first code example breaks because the "objects" value keeps a
reference to the beginning of the stream, which prevents the garbage
collector from collecting any stream cells. In the second and third
examples, the stream is not assigned to a variable, so the garbage
collector is free to reclaim the stream cells as they are "spent" by the
iteration.
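
A small sketch of that difference, under some assumptions of mine: it uses `LazyList` (the Scala 2.13+ successor to `Stream`), the names `StreamHead`, `sumWithoutHead` and `sumWithIterator` are invented for illustration, and whether spent cells are really reclaimed depends on the library version and the JVM, so treat the memory behaviour as plausible rather than guaranteed. The first method never binds the stream to a name; the second avoids memoisation altogether by using an `Iterator`.

```scala
object StreamHead {
  final class SomeObject(val i: Int)

  // The stream is consumed in a single expression, so no val holds on to
  // its first cell while foreach walks towards the end.
  def sumWithoutHead(n: Int): BigInt = {
    var sum: BigInt = 0
    LazyList.range(1, n).map(i => new SomeObject(i)).foreach(x => sum += x.i)
    sum
  }

  // An Iterator never caches elements, so each object can be collected
  // as soon as it has been added to the running total.
  def sumWithIterator(n: Int): BigInt =
    Iterator.range(1, n)
      .map(i => new SomeObject(i))
      .foldLeft(BigInt(0))((acc, x) => acc + BigInt(x.i))

  def main(args: Array[String]): Unit = {
    println(sumWithoutHead(1000))
    println(sumWithIterator(1000))
  }
}
```

Both methods sum the integers from 1 up to, but not including, n; the iterator version is the more defensive choice when the intermediate objects are not needed afterwards.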


> - it executes the code only once which is what would be the intention in
> most cases
> - people who really want the "busy" behaviour will most probably make a
> conscious decision to want this behaviour and so will not mind making that
> explicit in the code.

I'm inclined to agree with you. If (1 to n) were replaced with
Stream.range(), the most common for-loop idiom would continue to work
even with a large number of iterations, as shown above, yet the behaviour
would be less surprising when (1 to n) is explicitly assigned to a
variable that is referenced multiple times.

And if someone explicitly needs the range semantics, they can just use
the Range class directly, or, as Paul Phillips suggested in the other
thread, new methods (1 rangeTo n) and (1 rangeUntil n) could be created
that produce ranges.

-- Ivan


