Scope of redesigned collections

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
100 messages Options
12345
Reply | Threaded
Open this post in threaded view
|

Scope of redesigned collections

Martin Odersky
I have recently opened an issue inviting strawman designs for new
collection architectures:

https://github.com/lampepfl/dotty/issues/818

The issue lists some requirements for a design to be acceptable. Since
then there has been a discussion about several points that are all
outside the requirements outlined in #818. I think it's worthwhile to
discuss these, but do not think #818 is the proper place to do it.
That's why I would like to continue the discussion here.

 - Martin



--
Martin Odersky
EPFL

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: Scope of redesigned collections

Martin Odersky
Here is a first response to

>> We gloss over the mutabile/immutable distinction

> We probably want mutable collections to not have the same API as the immutable ones. One large problem with the current mutable collections is that by inheriting all the immutable methods, it's
neigh impossible to figure out where the "useful" mutable methods are.
e.g. Buffers append/remove is what you generally care about, but
they're lost in the huge sea of immutable methods you basically never
want. The same applies to mutable.Set, mutable.Queue, etc.

> By that logic, ArrayBuffer probably shouldn't be a collection, and perhaps making it convertible to one through .view or .iterator is enough?

If we take this idea to its logical conclusion, we should abandon
`map` and friends on arrays because they are also mutable. This would
render obsolete virtually every Scala book in existence. Besides, it
would be counter-productive. Map and friends are perfectly good
methods on arrays.
So if we do not want to go there, where to draw the line?

 - Martin




On Mon, Oct 5, 2015 at 10:18 PM, martin odersky <[hidden email]> wrote:

> I have recently opened an issue inviting strawman designs for new
> collection architectures:
>
> https://github.com/lampepfl/dotty/issues/818
>
> The issue lists some requirements for a design to be acceptable. Since
> then there has been a discussion about several points that are all
> outside the requirements outlined in #818. I think it's worthwhile to
> discuss these, but do not think #818 is the proper place to do it.
> That's why I would like to continue the discussion here.
>
>  - Martin
>
>
>
> --
> Martin Odersky
> EPFL



--
Martin Odersky
EPFL

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: Scope of redesigned collections

Haoyi Li
So if we do not want to go there, where to draw the line?

I say we draw the line between Arrays and everything else. They're already weird special things (reified generics, rubbish toString, rubbish equality, ...), so IMHO it's perfectly fine to have them be weird-special mutable objects with immutable operations. They're not even really collections in the first place!

After that, immutable.* operations on mutable.T forall T in {Seq, Set, Map, Buffer, Queue, PriorityQueue, ...}  should be all killed. 

On Mon, Oct 5, 2015 at 1:23 PM, martin odersky <[hidden email]> wrote:
Here is a first response to

>> We gloss over the mutabile/immutable distinction

> We probably want mutable collections to not have the same API as the immutable ones. One large problem with the current mutable collections is that by inheriting all the immutable methods, it's
neigh impossible to figure out where the "useful" mutable methods are.
e.g. Buffers append/remove is what you generally care about, but
they're lost in the huge sea of immutable methods you basically never
want. The same applies to mutable.Set, mutable.Queue, etc.

> By that logic, ArrayBuffer probably shouldn't be a collection, and perhaps making it convertible to one through .view or .iterator is enough?

If we take this idea to its logical conclusion, we should abandon
`map` and friends on arrays because they are also mutable. This would
render obsolete virtually every Scala book in existence. Besides, it
would be counter-productive. Map and friends are perfectly good
methods on arrays.
So if we do not want to go there, where to draw the line?

 - Martin




On Mon, Oct 5, 2015 at 10:18 PM, martin odersky <[hidden email]> wrote:
> I have recently opened an issue inviting strawman designs for new
> collection architectures:
>
> https://github.com/lampepfl/dotty/issues/818
>
> The issue lists some requirements for a design to be acceptable. Since
> then there has been a discussion about several points that are all
> outside the requirements outlined in #818. I think it's worthwhile to
> discuss these, but do not think #818 is the proper place to do it.
> That's why I would like to continue the discussion here.
>
>  - Martin
>
>
>
> --
> Martin Odersky
> EPFL



--
Martin Odersky
EPFL

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: Scope of redesigned collections

Martin Odersky
But, what about ResizableArray. It looks like an array, so why deny it
the same operations?

There's the other valid use case where you build up a mutable
collection and once it is fully constructed want to use it as if it
was immutable. Copying to make it immutable is not permitted for
performance reasons. How would be support that?

 - Martin



On Mon, Oct 5, 2015 at 10:30 PM, Haoyi Li <[hidden email]> wrote:

>> So if we do not want to go there, where to draw the line?
>
> I say we draw the line between Arrays and everything else. They're already
> weird special things (reified generics, rubbish toString, rubbish equality,
> ...), so IMHO it's perfectly fine to have them be weird-special mutable
> objects with immutable operations. They're not even really collections in
> the first place!
>
> After that, immutable.* operations on mutable.T forall T in {Seq, Set, Map,
> Buffer, Queue, PriorityQueue, ...}  should be all killed.
>
> On Mon, Oct 5, 2015 at 1:23 PM, martin odersky <[hidden email]>
> wrote:
>>
>> Here is a first response to
>>
>> >> We gloss over the mutabile/immutable distinction
>>
>> > We probably want mutable collections to not have the same API as the
>> > immutable ones. One large problem with the current mutable collections is
>> > that by inheriting all the immutable methods, it's
>> neigh impossible to figure out where the "useful" mutable methods are.
>> e.g. Buffers append/remove is what you generally care about, but
>> they're lost in the huge sea of immutable methods you basically never
>> want. The same applies to mutable.Set, mutable.Queue, etc.
>>
>> > By that logic, ArrayBuffer probably shouldn't be a collection, and
>> > perhaps making it convertible to one through .view or .iterator is enough?
>>
>> If we take this idea to its logical conclusion, we should abandon
>> `map` and friends on arrays because they are also mutable. This would
>> render obsolete virtually every Scala book in existence. Besides, it
>> would be counter-productive. Map and friends are perfectly good
>> methods on arrays.
>> So if we do not want to go there, where to draw the line?
>>
>>  - Martin
>>
>>
>>
>>
>> On Mon, Oct 5, 2015 at 10:18 PM, martin odersky <[hidden email]>
>> wrote:
>> > I have recently opened an issue inviting strawman designs for new
>> > collection architectures:
>> >
>> > https://github.com/lampepfl/dotty/issues/818
>> >
>> > The issue lists some requirements for a design to be acceptable. Since
>> > then there has been a discussion about several points that are all
>> > outside the requirements outlined in #818. I think it's worthwhile to
>> > discuss these, but do not think #818 is the proper place to do it.
>> > That's why I would like to continue the discussion here.
>> >
>> >  - Martin
>> >
>> >
>> >
>> > --
>> > Martin Odersky
>> > EPFL
>>
>>
>>
>> --
>> Martin Odersky
>> EPFL
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "scala-debate" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [hidden email].
>> For more options, visit https://groups.google.com/d/optout.
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "scala-debate" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [hidden email].
> For more options, visit https://groups.google.com/d/optout.



--
Martin Odersky
EPFL

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: Scope of redesigned collections

Haoyi Li
But, what about ResizableArray. It looks like an array, so why deny it
the same operations?

Good question. Do people use it? I count 220 occurrences on github in Scala files vs 370895 occurrences for "Seq" vs 48,532 occurrences for "ArrayBuffer", so I'd argue the added weirdness in that case won't affect too many people

> There's the other valid use case where you build up a mutable
collection and once it is fully constructed want to use it as if it
was immutable. Copying to make it immutable is not permitted for
performance reasons. How would be support that?


Isn't that what the XXXBuilder classes like VectorBuilder are for? If you want a builder you use a builder collection, not a mutable collection. Unless we're getting rid of builders?

On Mon, Oct 5, 2015 at 1:44 PM, martin odersky <[hidden email]> wrote:
But, what about ResizableArray. It looks like an array, so why deny it
the same operations?

There's the other valid use case where you build up a mutable
collection and once it is fully constructed want to use it as if it
was immutable. Copying to make it immutable is not permitted for
performance reasons. How would be support that?

 - Martin



On Mon, Oct 5, 2015 at 10:30 PM, Haoyi Li <[hidden email]> wrote:
>> So if we do not want to go there, where to draw the line?
>
> I say we draw the line between Arrays and everything else. They're already
> weird special things (reified generics, rubbish toString, rubbish equality,
> ...), so IMHO it's perfectly fine to have them be weird-special mutable
> objects with immutable operations. They're not even really collections in
> the first place!
>
> After that, immutable.* operations on mutable.T forall T in {Seq, Set, Map,
> Buffer, Queue, PriorityQueue, ...}  should be all killed.
>
> On Mon, Oct 5, 2015 at 1:23 PM, martin odersky <[hidden email]>
> wrote:
>>
>> Here is a first response to
>>
>> >> We gloss over the mutabile/immutable distinction
>>
>> > We probably want mutable collections to not have the same API as the
>> > immutable ones. One large problem with the current mutable collections is
>> > that by inheriting all the immutable methods, it's
>> neigh impossible to figure out where the "useful" mutable methods are.
>> e.g. Buffers append/remove is what you generally care about, but
>> they're lost in the huge sea of immutable methods you basically never
>> want. The same applies to mutable.Set, mutable.Queue, etc.
>>
>> > By that logic, ArrayBuffer probably shouldn't be a collection, and
>> > perhaps making it convertible to one through .view or .iterator is enough?
>>
>> If we take this idea to its logical conclusion, we should abandon
>> `map` and friends on arrays because they are also mutable. This would
>> render obsolete virtually every Scala book in existence. Besides, it
>> would be counter-productive. Map and friends are perfectly good
>> methods on arrays.
>> So if we do not want to go there, where to draw the line?
>>
>>  - Martin
>>
>>
>>
>>
>> On Mon, Oct 5, 2015 at 10:18 PM, martin odersky <[hidden email]>
>> wrote:
>> > I have recently opened an issue inviting strawman designs for new
>> > collection architectures:
>> >
>> > https://github.com/lampepfl/dotty/issues/818
>> >
>> > The issue lists some requirements for a design to be acceptable. Since
>> > then there has been a discussion about several points that are all
>> > outside the requirements outlined in #818. I think it's worthwhile to
>> > discuss these, but do not think #818 is the proper place to do it.
>> > That's why I would like to continue the discussion here.
>> >
>> >  - Martin
>> >
>> >
>> >
>> > --
>> > Martin Odersky
>> > EPFL
>>
>>
>>
>> --
>> Martin Odersky
>> EPFL
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "scala-debate" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [hidden email].
>> For more options, visit https://groups.google.com/d/optout.
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "scala-debate" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [hidden email].
> For more options, visit https://groups.google.com/d/optout.



--
Martin Odersky
EPFL

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: Scope of redesigned collections

Aleksandar Prokopec


On Monday, 5 October 2015 22:54:03 UTC+2, Haoyi Li wrote:
But, what about ResizableArray. It looks like an array, so why deny it
the same operations?

Good question. Do people use it? I count 220 occurrences on github in Scala files vs 370895 occurrences for "Seq" vs 48,532 occurrences for "ArrayBuffer", so I'd argue the added weirdness in that case won't affect too many people

> There's the other valid use case where you build up a mutable
collection and once it is fully constructed want to use it as if it
was immutable. Copying to make it immutable is not permitted for
performance reasons. How would be support that?



To play the devils advocate with what Martin said: we don't need to copy a mutable collection to make it immutable, we could just freeze it, i.e. disallow further updates.
Another possibility would be to go the Rust way, and use ownership types to ensure that iteration and mutation don't happen at the same time.
 
@Haoyi LI:
I'm still not sure what is the root cause why we would want to remove operations on mutable collections (and from the sentence "Buffers append/remove is what you generally care about, but  they're lost in the huge sea of immutable methods you basically never  want", it's not clear to me which ones you want to see removed).
Could you clarify?
 
Isn't that what the XXXBuilder classes like <a href="http://www.scala-lang.org/files/archive/nightly/docs/library/index.html#scala.collection.immutable.VectorBuilder" target="_blank" rel="nofollow" onmousedown="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Fwww.scala-lang.org%2Ffiles%2Farchive%2Fnightly%2Fdocs%2Flibrary%2Findex.html%23scala.collection.immutable.VectorBuilder\46sa\75D\46sntz\0751\46usg\75AFQjCNHrAETY4RVatfSgxbuHCj-xRf_8lQ&#39;;return true;" onclick="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Fwww.scala-lang.org%2Ffiles%2Farchive%2Fnightly%2Fdocs%2Flibrary%2Findex.html%23scala.collection.immutable.VectorBuilder\46sa\75D\46sntz\0751\46usg\75AFQjCNHrAETY4RVatfSgxbuHCj-xRf_8lQ&#39;;return true;">VectorBuilder are for? If you want a builder you use a builder collection, not a mutable collection. Unless we're getting rid of builders?


There is more to mutable collections than just the builder pattern in which elements are monotonically added.
Maps and sets (although not in the strawman proposal) support remove and conditional removes.


On Mon, Oct 5, 2015 at 1:44 PM, martin odersky <<a href="javascript:" target="_blank" gdf-obfuscated-mailto="6_GMVMg8AQAJ" rel="nofollow" onmousedown="this.href=&#39;javascript:&#39;;return true;" onclick="this.href=&#39;javascript:&#39;;return true;">martin....@...> wrote:
But, what about ResizableArray. It looks like an array, so why deny it
the same operations?

There's the other valid use case where you build up a mutable
collection and once it is fully constructed want to use it as if it
was immutable. Copying to make it immutable is not permitted for
performance reasons. How would be support that?

 - Martin



On Mon, Oct 5, 2015 at 10:30 PM, Haoyi Li <<a href="javascript:" target="_blank" gdf-obfuscated-mailto="6_GMVMg8AQAJ" rel="nofollow" onmousedown="this.href=&#39;javascript:&#39;;return true;" onclick="this.href=&#39;javascript:&#39;;return true;">haoy...@...> wrote:
>> So if we do not want to go there, where to draw the line?
>
> I say we draw the line between Arrays and everything else. They're already
> weird special things (reified generics, rubbish toString, rubbish equality,
> ...), so IMHO it's perfectly fine to have them be weird-special mutable
> objects with immutable operations. They're not even really collections in
> the first place!
>
> After that, immutable.* operations on mutable.T forall T in {Seq, Set, Map,
> Buffer, Queue, PriorityQueue, ...}  should be all killed.
>
> On Mon, Oct 5, 2015 at 1:23 PM, martin odersky <<a href="javascript:" target="_blank" gdf-obfuscated-mailto="6_GMVMg8AQAJ" rel="nofollow" onmousedown="this.href=&#39;javascript:&#39;;return true;" onclick="this.href=&#39;javascript:&#39;;return true;">martin....@...>
> wrote:
>>
>> Here is a first response to
>>
>> >> We gloss over the mutabile/immutable distinction
>>
>> > We probably want mutable collections to not have the same API as the
>> > immutable ones. One large problem with the current mutable collections is
>> > that by inheriting all the immutable methods, it's
>> neigh impossible to figure out where the "useful" mutable methods are.
>> e.g. Buffers append/remove is what you generally care about, but
>> they're lost in the huge sea of immutable methods you basically never
>> want. The same applies to mutable.Set, mutable.Queue, etc.
>>
>> > By that logic, ArrayBuffer probably shouldn't be a collection, and
>> > perhaps making it convertible to one through .view or .iterator is enough?
>>
>> If we take this idea to its logical conclusion, we should abandon
>> `map` and friends on arrays because they are also mutable. This would
>> render obsolete virtually every Scala book in existence. Besides, it
>> would be counter-productive. Map and friends are perfectly good
>> methods on arrays.
>> So if we do not want to go there, where to draw the line?
>>
>>  - Martin
>>
>>
>>
>>
>> On Mon, Oct 5, 2015 at 10:18 PM, martin odersky <<a href="javascript:" target="_blank" gdf-obfuscated-mailto="6_GMVMg8AQAJ" rel="nofollow" onmousedown="this.href=&#39;javascript:&#39;;return true;" onclick="this.href=&#39;javascript:&#39;;return true;">martin....@...>
>> wrote:
>> > I have recently opened an issue inviting strawman designs for new
>> > collection architectures:
>> >
>> > <a href="https://github.com/lampepfl/dotty/issues/818" rel="nofollow" target="_blank" onmousedown="this.href=&#39;https://www.google.com/url?q\75https%3A%2F%2Fgithub.com%2Flampepfl%2Fdotty%2Fissues%2F818\46sa\75D\46sntz\0751\46usg\75AFQjCNGOFHWkF3kPkL2sbfn7D1oRdfo2jg&#39;;return true;" onclick="this.href=&#39;https://www.google.com/url?q\75https%3A%2F%2Fgithub.com%2Flampepfl%2Fdotty%2Fissues%2F818\46sa\75D\46sntz\0751\46usg\75AFQjCNGOFHWkF3kPkL2sbfn7D1oRdfo2jg&#39;;return true;">https://github.com/lampepfl/dotty/issues/818
>> >
>> > The issue lists some requirements for a design to be acceptable. Since
>> > then there has been a discussion about several points that are all
>> > outside the requirements outlined in #818. I think it's worthwhile to
>> > discuss these, but do not think #818 is the proper place to do it.
>> > That's why I would like to continue the discussion here.
>> >
>> >  - Martin
>> >
>> >
>> >
>> > --
>> > Martin Odersky
>> > EPFL
>>
>>
>>
>> --
>> Martin Odersky
>> EPFL
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "scala-debate" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to <a href="javascript:" target="_blank" gdf-obfuscated-mailto="6_GMVMg8AQAJ" rel="nofollow" onmousedown="this.href=&#39;javascript:&#39;;return true;" onclick="this.href=&#39;javascript:&#39;;return true;">scala-debate...@googlegroups.com.
>> For more options, visit <a href="https://groups.google.com/d/optout" rel="nofollow" target="_blank" onmousedown="this.href=&#39;https://groups.google.com/d/optout&#39;;return true;" onclick="this.href=&#39;https://groups.google.com/d/optout&#39;;return true;">https://groups.google.com/d/optout.
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "scala-debate" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to <a href="javascript:" target="_blank" gdf-obfuscated-mailto="6_GMVMg8AQAJ" rel="nofollow" onmousedown="this.href=&#39;javascript:&#39;;return true;" onclick="this.href=&#39;javascript:&#39;;return true;">scala-debate...@googlegroups.com.
> For more options, visit <a href="https://groups.google.com/d/optout" rel="nofollow" target="_blank" onmousedown="this.href=&#39;https://groups.google.com/d/optout&#39;;return true;" onclick="this.href=&#39;https://groups.google.com/d/optout&#39;;return true;">https://groups.google.com/d/optout.



--
Martin Odersky
EPFL

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: Scope of redesigned collections

Haoyi Li
To play the devils advocate with what Martin said: we don't need to copy a mutable collection to make it immutable, we could just freeze it, i.e. disallow further updates.

Re-sending what I sent to martin to the list, I must've forgotten to reply-all

Would an explicit wrapper work in that case? e.g. 

def freeze(myArray: Array[T]): IndexedSeq[T]

This "want to turn mutable collection into immutable one with zero copy" use case happens, but I'd guess happens a lot less than the use case of "want to use mutable collection as mutable collection and not accidentally call expensive immutable methods", so we can optimize the common case to make it impossible to accidentally call immutable methods, while providing a short one-function helper to make the other use case convenient too

> Another possibility would be to go the Rust way, and use ownership types to ensure that iteration and mutation don't happen at the same time.

I'd love if this was the case but I doubt it will happen in next 2 years =P

 > I'm still not sure what is the root cause why we would want to remove operations on mutable collections (and from the sentence "Buffers append/remove is what you generally care about, but  they're lost in the huge sea of immutable methods you basically never  want", it's not clear to me which ones you want to see removed).
Could you clarify?

Basically everything that returns a new collection should not be part of the mutable API by default. The whole point of using mutable collections is to mutate them in place and not re-allocate all the time. If I wanted to re-allocate all over the place I'd just use an immutable collection. The number of times I've actually wanted to call map, filter, etc. on a mutable collection in the last 3 years has been zero.

As an exercise, compare:

Compare mutable.ArrayBuffer:


with java.util.ArrayList:


Get a stopwatch and answer the following questions:

- How do I remove every element satisfying some predicate from mutable.ArrayBuffer, in place?
- How do I remove every element satisfying some predicate from java.util.ArrayList, in place?

- How do sort the mutable.ArrayBuffer by some predicate, in place?
 -How do sort the java.util.ArrayList by some predicate, in place?

Time how long it takes you to answer each question (how to do it, or if it cannot be done) by browsing the list of methods. 

For me it takes >10x as long to answer each one for mutable.ArrayBuffer, due to the huge pile of "useless" copying methods.

SPOILER BELOW






































It turns out you can't perform either operation on a mutable.ArrayBuffer, which is pretty surprising given these are very common operations to want to do on any mutable collection. 


On Mon, Oct 5, 2015 at 2:09 PM, Aleksandar Prokopec <[hidden email]> wrote:


On Monday, 5 October 2015 22:54:03 UTC+2, Haoyi Li wrote:
But, what about ResizableArray. It looks like an array, so why deny it
the same operations?

Good question. Do people use it? I count 220 occurrences on github in Scala files vs 370895 occurrences for "Seq" vs 48,532 occurrences for "ArrayBuffer", so I'd argue the added weirdness in that case won't affect too many people

> There's the other valid use case where you build up a mutable
collection and once it is fully constructed want to use it as if it
was immutable. Copying to make it immutable is not permitted for
performance reasons. How would be support that?



To play the devils advocate with what Martin said: we don't need to copy a mutable collection to make it immutable, we could just freeze it, i.e. disallow further updates.
Another possibility would be to go the Rust way, and use ownership types to ensure that iteration and mutation don't happen at the same time.
 
@Haoyi LI:
I'm still not sure what is the root cause why we would want to remove operations on mutable collections (and from the sentence "Buffers append/remove is what you generally care about, but  they're lost in the huge sea of immutable methods you basically never  want", it's not clear to me which ones you want to see removed).
Could you clarify?
 
Isn't that what the XXXBuilder classes like VectorBuilder are for? If you want a builder you use a builder collection, not a mutable collection. Unless we're getting rid of builders?


There is more to mutable collections than just the builder pattern in which elements are monotonically added.
Maps and sets (although not in the strawman proposal) support remove and conditional removes.


On Mon, Oct 5, 2015 at 1:44 PM, martin odersky <[hidden email]> wrote:
But, what about ResizableArray. It looks like an array, so why deny it
the same operations?

There's the other valid use case where you build up a mutable
collection and once it is fully constructed want to use it as if it
was immutable. Copying to make it immutable is not permitted for
performance reasons. How would be support that?

 - Martin



On Mon, Oct 5, 2015 at 10:30 PM, Haoyi Li <[hidden email]> wrote:
>> So if we do not want to go there, where to draw the line?
>
> I say we draw the line between Arrays and everything else. They're already
> weird special things (reified generics, rubbish toString, rubbish equality,
> ...), so IMHO it's perfectly fine to have them be weird-special mutable
> objects with immutable operations. They're not even really collections in
> the first place!
>
> After that, immutable.* operations on mutable.T forall T in {Seq, Set, Map,
> Buffer, Queue, PriorityQueue, ...}  should be all killed.
>
> On Mon, Oct 5, 2015 at 1:23 PM, martin odersky <[hidden email]>
> wrote:
>>
>> Here is a first response to
>>
>> >> We gloss over the mutabile/immutable distinction
>>
>> > We probably want mutable collections to not have the same API as the
>> > immutable ones. One large problem with the current mutable collections is
>> > that by inheriting all the immutable methods, it's
>> neigh impossible to figure out where the "useful" mutable methods are.
>> e.g. Buffers append/remove is what you generally care about, but
>> they're lost in the huge sea of immutable methods you basically never
>> want. The same applies to mutable.Set, mutable.Queue, etc.
>>
>> > By that logic, ArrayBuffer probably shouldn't be a collection, and
>> > perhaps making it convertible to one through .view or .iterator is enough?
>>
>> If we take this idea to its logical conclusion, we should abandon
>> `map` and friends on arrays because they are also mutable. This would
>> render obsolete virtually every Scala book in existence. Besides, it
>> would be counter-productive. Map and friends are perfectly good
>> methods on arrays.
>> So if we do not want to go there, where to draw the line?
>>
>>  - Martin
>>
>>
>>
>>
>> On Mon, Oct 5, 2015 at 10:18 PM, martin odersky <[hidden email]>
>> wrote:
>> > I have recently opened an issue inviting strawman designs for new
>> > collection architectures:
>> >
>> > https://github.com/lampepfl/dotty/issues/818
>> >
>> > The issue lists some requirements for a design to be acceptable. Since
>> > then there has been a discussion about several points that are all
>> > outside the requirements outlined in #818. I think it's worthwhile to
>> > discuss these, but do not think #818 is the proper place to do it.
>> > That's why I would like to continue the discussion here.
>> >
>> >  - Martin
>> >
>> >
>> >
>> > --
>> > Martin Odersky
>> > EPFL
>>
>>
>>
>> --
>> Martin Odersky
>> EPFL
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "scala-debate" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [hidden email].
>> For more options, visit https://groups.google.com/d/optout.
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "scala-debate" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [hidden email].
> For more options, visit https://groups.google.com/d/optout.



--
Martin Odersky
EPFL

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: Scope of redesigned collections

Aleksandar Prokopec


On Monday, 5 October 2015 23:26:13 UTC+2, Haoyi Li wrote:
To play the devils advocate with what Martin said: we don't need to copy a mutable collection to make it immutable, we could just freeze it, i.e. disallow further updates.

Re-sending what I sent to martin to the list, I must've forgotten to reply-all

Would an explicit wrapper work in that case? e.g. 

def freeze(myArray: Array[T]): IndexedSeq[T]

This "want to turn mutable collection into immutable one with zero copy" use case happens, but I'd guess happens a lot less than the use case of "want to use mutable collection as mutable collection and not accidentally call expensive immutable methods", so we can optimize the common case to make it impossible to accidentally call immutable methods, while providing a short one-function helper to make the other use case convenient too


Agreed about the two use-cases.
 
> Another possibility would be to go the Rust way, and use ownership types to ensure that iteration and mutation don't happen at the same time.

I'd love if this was the case but I doubt it will happen in next 2 years =P

 > I'm still not sure what is the root cause why we would want to remove operations on mutable collections (and from the sentence "Buffers append/remove is what you generally care about, but  they're lost in the huge sea of immutable methods you basically never  want", it's not clear to me which ones you want to see removed).
Could you clarify?

Basically everything that returns a new collection should not be part of the mutable API by default.

I like the part here where you say "by default". :)
If we consider bulk operations (i.e. map, filter, flatMap, sorted, ...) and other copying operations (:+, +:, and similar smiley methods) dangerous because they incur serious performance penalties, we could disallow them on mutable collections.
But, I'm sure that some people have valid use cases where they find these methods useful, and they should be able to enable them via import of a decorator or a typeclass.

This design seems to go hand-in-hand with what implicit decorators currently in the strawman proposal.

(as a side note, I find it funny how we go at great lengths to prevent our users from shooting themselves in the foot)

The whole point of using mutable collections is to mutate them in place and not re-allocate all the time. If I wanted to re-allocate all over the place I'd just use an immutable collection. The number of times I've actually wanted to call map, filter, etc. on a mutable collection in the last 3 years has been zero.


Still, people sometimes return e.g. ArrayBuffers from interface methods that declare that they return a Seq - and ArrayBuffers are one of the more efficient sequence implementations.
And once you have an immutable view (Seq) of a mutable collection (ArrayBuffer), it's easy to do that mistake.
Note that the current strawman says ArrayBuffer extends Seq.
Should a conversion from a mutable collection to a sequence also be enabled through an explicitly imported implicit conversion?
 
As an exercise, compare:

Compare mutable.ArrayBuffer:

<a href="http://www.scala-lang.org/api/current/scala/collection/mutable/ArrayBuffer.html" target="_blank" rel="nofollow" onmousedown="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Fwww.scala-lang.org%2Fapi%2Fcurrent%2Fscala%2Fcollection%2Fmutable%2FArrayBuffer.html\46sa\75D\46sntz\0751\46usg\75AFQjCNGmgmp-ix83Evof7W8utLMv8jq0eA&#39;;return true;" onclick="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Fwww.scala-lang.org%2Fapi%2Fcurrent%2Fscala%2Fcollection%2Fmutable%2FArrayBuffer.html\46sa\75D\46sntz\0751\46usg\75AFQjCNGmgmp-ix83Evof7W8utLMv8jq0eA&#39;;return true;">http://www.scala-lang.org/api/current/scala/collection/mutable/ArrayBuffer.html

with java.util.ArrayList:

<a href="https://docs.oracle.com/javase/8/docs/api/java/util/ArrayList.html" target="_blank" rel="nofollow" onmousedown="this.href=&#39;https://www.google.com/url?q\75https%3A%2F%2Fdocs.oracle.com%2Fjavase%2F8%2Fdocs%2Fapi%2Fjava%2Futil%2FArrayList.html\46sa\75D\46sntz\0751\46usg\75AFQjCNGdCCPJ_BVyLV0xTDVSYEO6u7XtWg&#39;;return true;" onclick="this.href=&#39;https://www.google.com/url?q\75https%3A%2F%2Fdocs.oracle.com%2Fjavase%2F8%2Fdocs%2Fapi%2Fjava%2Futil%2FArrayList.html\46sa\75D\46sntz\0751\46usg\75AFQjCNGdCCPJ_BVyLV0xTDVSYEO6u7XtWg&#39;;return true;">https://docs.oracle.com/javase/8/docs/api/java/util/ArrayList.html

Get a stopwatch and answer the following questions:

- How do I remove every element satisfying some predicate from mutable.ArrayBuffer, in place?
- How do I remove every element satisfying some predicate from java.util.ArrayList, in place?

- How do sort the mutable.ArrayBuffer by some predicate, in place?
 -How do sort the java.util.ArrayList by some predicate, in place?

Time how long it takes you to answer each question (how to do it, or if it cannot be done) by browsing the list of methods. 

For me it takes >10x as long to answer each one for mutable.ArrayBuffer, due to the huge pile of "useless" copying methods.

SPOILER BELOW






































It turns out you can't perform either operation on a mutable.ArrayBuffer, which is pretty surprising given these are very common operations to want to do on any mutable collection. 


This could be addressed with another family of decorator methods that are specific to mutable collections.
Is that what you would propose?
 

On Mon, Oct 5, 2015 at 2:09 PM, Aleksandar Prokopec <<a href="javascript:" target="_blank" gdf-obfuscated-mailto="pjNTi4k-AQAJ" rel="nofollow" onmousedown="this.href=&#39;javascript:&#39;;return true;" onclick="this.href=&#39;javascript:&#39;;return true;">aleksanda...@...> wrote:


On Monday, 5 October 2015 22:54:03 UTC+2, Haoyi Li wrote:
But, what about ResizableArray. It looks like an array, so why deny it
the same operations?

Good question. Do people use it? I count 220 occurrences on github in Scala files vs 370895 occurrences for "Seq" vs 48,532 occurrences for "ArrayBuffer", so I'd argue the added weirdness in that case won't affect too many people

> There's the other valid use case where you build up a mutable
collection and once it is fully constructed want to use it as if it
was immutable. Copying to make it immutable is not permitted for
performance reasons. How would be support that?



To play the devils advocate with what Martin said: we don't need to copy a mutable collection to make it immutable, we could just freeze it, i.e. disallow further updates.
Another possibility would be to go the Rust way, and use ownership types to ensure that iteration and mutation don't happen at the same time.
 
@Haoyi LI:
I'm still not sure what is the root cause why we would want to remove operations on mutable collections (and from the sentence "Buffers append/remove is what you generally care about, but  they're lost in the huge sea of immutable methods you basically never  want", it's not clear to me which ones you want to see removed).
Could you clarify?
 
Isn't that what the XXXBuilder classes like <a href="http://www.scala-lang.org/files/archive/nightly/docs/library/index.html#scala.collection.immutable.VectorBuilder" rel="nofollow" target="_blank" onmousedown="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Fwww.scala-lang.org%2Ffiles%2Farchive%2Fnightly%2Fdocs%2Flibrary%2Findex.html%23scala.collection.immutable.VectorBuilder\46sa\75D\46sntz\0751\46usg\75AFQjCNHrAETY4RVatfSgxbuHCj-xRf_8lQ&#39;;return true;" onclick="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Fwww.scala-lang.org%2Ffiles%2Farchive%2Fnightly%2Fdocs%2Flibrary%2Findex.html%23scala.collection.immutable.VectorBuilder\46sa\75D\46sntz\0751\46usg\75AFQjCNHrAETY4RVatfSgxbuHCj-xRf_8lQ&#39;;return true;">VectorBuilder are for? If you want a builder you use a builder collection, not a mutable collection. Unless we're getting rid of builders?


There is more to mutable collections than just the builder pattern in which elements are monotonically added.
Maps and sets (although not in the strawman proposal) support remove and conditional removes.


On Mon, Oct 5, 2015 at 1:44 PM, martin odersky <[hidden email]> wrote:
But, what about ResizableArray. It looks like an array, so why deny it
the same operations?

There's the other valid use case where you build up a mutable
collection and once it is fully constructed want to use it as if it
was immutable. Copying to make it immutable is not permitted for
performance reasons. How would be support that?

 - Martin



On Mon, Oct 5, 2015 at 10:30 PM, Haoyi Li <[hidden email]> wrote:
>> So if we do not want to go there, where to draw the line?
>
> I say we draw the line between Arrays and everything else. They're already
> weird special things (reified generics, rubbish toString, rubbish equality,
> ...), so IMHO it's perfectly fine to have them be weird-special mutable
> objects with immutable operations. They're not even really collections in
> the first place!
>
> After that, immutable.* operations on mutable.T forall T in {Seq, Set, Map,
> Buffer, Queue, PriorityQueue, ...}  should be all killed.
>
> On Mon, Oct 5, 2015 at 1:23 PM, martin odersky <[hidden email]>
> wrote:
>>
>> Here is a first response to
>>
>> >> We gloss over the mutabile/immutable distinction
>>
>> > We probably want mutable collections to not have the same API as the
>> > immutable ones. One large problem with the current mutable collections is
>> > that by inheriting all the immutable methods, it's
>> neigh impossible to figure out where the "useful" mutable methods are.
>> e.g. Buffers append/remove is what you generally care about, but
>> they're lost in the huge sea of immutable methods you basically never
>> want. The same applies to mutable.Set, mutable.Queue, etc.
>>
>> > By that logic, ArrayBuffer probably shouldn't be a collection, and
>> > perhaps making it convertible to one through .view or .iterator is enough?
>>
>> If we take this idea to its logical conclusion, we should abandon
>> `map` and friends on arrays because they are also mutable. This would
>> render obsolete virtually every Scala book in existence. Besides, it
>> would be counter-productive. Map and friends are perfectly good
>> methods on arrays.
>> So if we do not want to go there, where to draw the line?
>>
>>  - Martin
>>
>>
>>
>>
>> On Mon, Oct 5, 2015 at 10:18 PM, martin odersky <[hidden email]>
>> wrote:
>> > I have recently opened an issue inviting strawman designs for new
>> > collection architectures:
>> >
>> > <a href="https://github.com/lampepfl/dotty/issues/818" rel="nofollow" target="_blank" onmousedown="this.href=&#39;https://www.google.com/url?q\75https%3A%2F%2Fgithub.com%2Flampepfl%2Fdotty%2Fissues%2F818\46sa\75D\46sntz\0751\46usg\75AFQjCNGOFHWkF3kPkL2sbfn7D1oRdfo2jg&#39;;return true;" onclick="this.href=&#39;https://www.google.com/url?q\75https%3A%2F%2Fgithub.com%2Flampepfl%2Fdotty%2Fissues%2F818\46sa\75D\46sntz\0751\46usg\75AFQjCNGOFHWkF3kPkL2sbfn7D1oRdfo2jg&#39;;return true;">https://github.com/lampepfl/dotty/issues/818
>> >
>> > The issue lists some requirements for a design to be acceptable. Since
>> > then there has been a discussion about several points that are all
>> > outside the requirements outlined in #818. I think it's worthwhile to
>> > discuss these, but do not think #818 is the proper place to do it.
>> > That's why I would like to continue the discussion here.
>> >
>> >  - Martin
>> >
>> >
>> >
>> > --
>> > Martin Odersky
>> > EPFL
>>
>>
>>
>> --
>> Martin Odersky
>> EPFL
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "scala-debate" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to scala-debate...@googlegroups.com.
>> For more options, visit <a href="https://groups.google.com/d/optout" rel="nofollow" target="_blank" onmousedown="this.href=&#39;https://groups.google.com/d/optout&#39;;return true;" onclick="this.href=&#39;https://groups.google.com/d/optout&#39;;return true;">https://groups.google.com/d/optout.
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "scala-debate" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to scala-debate...@googlegroups.com.
> For more options, visit <a href="https://groups.google.com/d/optout" rel="nofollow" target="_blank" onmousedown="this.href=&#39;https://groups.google.com/d/optout&#39;;return true;" onclick="this.href=&#39;https://groups.google.com/d/optout&#39;;return true;">https://groups.google.com/d/optout.



--
Martin Odersky
EPFL

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to <a href="javascript:" target="_blank" gdf-obfuscated-mailto="pjNTi4k-AQAJ" rel="nofollow" onmousedown="this.href=&#39;javascript:&#39;;return true;" onclick="this.href=&#39;javascript:&#39;;return true;">scala-debate...@googlegroups.com.
For more options, visit <a href="https://groups.google.com/d/optout" target="_blank" rel="nofollow" onmousedown="this.href=&#39;https://groups.google.com/d/optout&#39;;return true;" onclick="this.href=&#39;https://groups.google.com/d/optout&#39;;return true;">https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: Scope of redesigned collections

Haoyi Li
But, I'm sure that some people have valid use cases where they find these methods useful,

Let's skip the hypotheticals and ask: who are these people and what are their use cases? In theory everything is useful to someone somewhere maybe. In the end there's always an empirical judgement of how many people you think will want this functionality. "Non-zero, maybe" isn't satisfactory =P 

Could we kick of community builds with instrumentation via Abide to see what the actual usage numbers on the various APIs are? That would help give us something concrete to talk about, and perhaps suggest alternatives for the use cases we do find.

as a side note, I find it funny how we go at great lengths to prevent our users from shooting themselves in the foot

While Scala's immutable collections are nice, Scala's mutable collections API is one of the most easily-foot-shooting APIs I know of. 100s of methods you don't want, barely any methods you want, missing several methods you *do* want. 

Python, Java, and many other languages provide more ergonomic mutable collections. Let's not be too proud of ourselves =D 

Should a conversion from a mutable collection to a sequence also be enabled through an explicitly imported implicit conversion?

What's wrong with just an explicit conversion?

def freeze(input: mutable.Seq[T]): immutable.Seq[T]

Not everything needs to be an implicit.


On Mon, Oct 5, 2015 at 2:47 PM, Aleksandar Prokopec <[hidden email]> wrote:


On Monday, 5 October 2015 23:26:13 UTC+2, Haoyi Li wrote:
To play the devils advocate with what Martin said: we don't need to copy a mutable collection to make it immutable, we could just freeze it, i.e. disallow further updates.

Re-sending what I sent to martin to the list, I must've forgotten to reply-all

Would an explicit wrapper work in that case? e.g. 

def freeze(myArray: Array[T]): IndexedSeq[T]

This "want to turn mutable collection into immutable one with zero copy" use case happens, but I'd guess happens a lot less than the use case of "want to use mutable collection as mutable collection and not accidentally call expensive immutable methods", so we can optimize the common case to make it impossible to accidentally call immutable methods, while providing a short one-function helper to make the other use case convenient too


Agreed about the two use-cases.
 
> Another possibility would be to go the Rust way, and use ownership types to ensure that iteration and mutation don't happen at the same time.

I'd love if this was the case but I doubt it will happen in next 2 years =P

 > I'm still not sure what is the root cause why we would want to remove operations on mutable collections (and from the sentence "Buffers append/remove is what you generally care about, but  they're lost in the huge sea of immutable methods you basically never  want", it's not clear to me which ones you want to see removed).
Could you clarify?

Basically everything that returns a new collection should not be part of the mutable API by default.

I like the part here where you say "by default". :)
If we consider bulk operations (i.e. map, filter, flatMap, sorted, ...) and other copying operations (:+, +:, and similar smiley methods) dangerous because they incur serious performance penalties, we could disallow them on mutable collections.
But, I'm sure that some people have valid use cases where they find these methods useful, and they should be able to enable them via import of a decorator or a typeclass.

This design seems to go hand-in-hand with what implicit decorators currently in the strawman proposal.

(as a side note, I find it funny how we go at great lengths to prevent our users from shooting themselves in the foot)

The whole point of using mutable collections is to mutate them in place and not re-allocate all the time. If I wanted to re-allocate all over the place I'd just use an immutable collection. The number of times I've actually wanted to call map, filter, etc. on a mutable collection in the last 3 years has been zero.


Still, people sometimes return e.g. ArrayBuffers from interface methods that declare that they return a Seq - and ArrayBuffers are one of the more efficient sequence implementations.
And once you have an immutable view (Seq) of a mutable collection (ArrayBuffer), it's easy to do that mistake.
Note that the current strawman says ArrayBuffer extends Seq.
Should a conversion from a mutable collection to a sequence also be enabled through an explicitly imported implicit conversion?
 
As an exercise, compare:

Compare mutable.ArrayBuffer:


with java.util.ArrayList:


Get a stopwatch and answer the following questions:

- How do I remove every element satisfying some predicate from mutable.ArrayBuffer, in place?
- How do I remove every element satisfying some predicate from java.util.ArrayList, in place?

- How do sort the mutable.ArrayBuffer by some predicate, in place?
 -How do sort the java.util.ArrayList by some predicate, in place?

Time how long it takes you to answer each question (how to do it, or if it cannot be done) by browsing the list of methods. 

For me it takes >10x as long to answer each one for mutable.ArrayBuffer, due to the huge pile of "useless" copying methods.

SPOILER BELOW






































It turns out you can't perform either operation on a mutable.ArrayBuffer, which is pretty surprising given these are very common operations to want to do on any mutable collection. 


This could be addressed with another family of decorator methods that are specific to mutable collections.
Is that what you would propose?
 

On Mon, Oct 5, 2015 at 2:09 PM, Aleksandar Prokopec <[hidden email]> wrote:


On Monday, 5 October 2015 22:54:03 UTC+2, Haoyi Li wrote:
But, what about ResizableArray. It looks like an array, so why deny it
the same operations?

Good question. Do people use it? I count 220 occurrences on github in Scala files vs 370895 occurrences for "Seq" vs 48,532 occurrences for "ArrayBuffer", so I'd argue the added weirdness in that case won't affect too many people

> There's the other valid use case where you build up a mutable
collection and once it is fully constructed want to use it as if it
was immutable. Copying to make it immutable is not permitted for
performance reasons. How would be support that?



To play the devils advocate with what Martin said: we don't need to copy a mutable collection to make it immutable, we could just freeze it, i.e. disallow further updates.
Another possibility would be to go the Rust way, and use ownership types to ensure that iteration and mutation don't happen at the same time.
 
@Haoyi LI:
I'm still not sure what is the root cause why we would want to remove operations on mutable collections (and from the sentence "Buffers append/remove is what you generally care about, but  they're lost in the huge sea of immutable methods you basically never  want", it's not clear to me which ones you want to see removed).
Could you clarify?
 
Isn't that what the XXXBuilder classes like VectorBuilder are for? If you want a builder you use a builder collection, not a mutable collection. Unless we're getting rid of builders?


There is more to mutable collections than just the builder pattern in which elements are monotonically added.
Maps and sets (although not in the strawman proposal) support remove and conditional removes.


On Mon, Oct 5, 2015 at 1:44 PM, martin odersky <[hidden email]> wrote:
But, what about ResizableArray. It looks like an array, so why deny it
the same operations?

There's the other valid use case where you build up a mutable
collection and once it is fully constructed want to use it as if it
was immutable. Copying to make it immutable is not permitted for
performance reasons. How would be support that?

 - Martin



On Mon, Oct 5, 2015 at 10:30 PM, Haoyi Li <[hidden email]> wrote:
>> So if we do not want to go there, where to draw the line?
>
> I say we draw the line between Arrays and everything else. They're already
> weird special things (reified generics, rubbish toString, rubbish equality,
> ...), so IMHO it's perfectly fine to have them be weird-special mutable
> objects with immutable operations. They're not even really collections in
> the first place!
>
> After that, immutable.* operations on mutable.T forall T in {Seq, Set, Map,
> Buffer, Queue, PriorityQueue, ...}  should be all killed.
>
> On Mon, Oct 5, 2015 at 1:23 PM, martin odersky <[hidden email]>
> wrote:
>>
>> Here is a first response to
>>
>> >> We gloss over the mutabile/immutable distinction
>>
>> > We probably want mutable collections to not have the same API as the
>> > immutable ones. One large problem with the current mutable collections is
>> > that by inheriting all the immutable methods, it's
>> neigh impossible to figure out where the "useful" mutable methods are.
>> e.g. Buffers append/remove is what you generally care about, but
>> they're lost in the huge sea of immutable methods you basically never
>> want. The same applies to mutable.Set, mutable.Queue, etc.
>>
>> > By that logic, ArrayBuffer probably shouldn't be a collection, and
>> > perhaps making it convertible to one through .view or .iterator is enough?
>>
>> If we take this idea to its logical conclusion, we should abandon
>> `map` and friends on arrays because they are also mutable. This would
>> render obsolete virtually every Scala book in existence. Besides, it
>> would be counter-productive. Map and friends are perfectly good
>> methods on arrays.
>> So if we do not want to go there, where to draw the line?
>>
>>  - Martin
>>
>>
>>
>>
>> On Mon, Oct 5, 2015 at 10:18 PM, martin odersky <[hidden email]>
>> wrote:
>> > I have recently opened an issue inviting strawman designs for new
>> > collection architectures:
>> >
>> > https://github.com/lampepfl/dotty/issues/818
>> >
>> > The issue lists some requirements for a design to be acceptable. Since
>> > then there has been a discussion about several points that are all
>> > outside the requirements outlined in #818. I think it's worthwhile to
>> > discuss these, but do not think #818 is the proper place to do it.
>> > That's why I would like to continue the discussion here.
>> >
>> >  - Martin
>> >
>> >
>> >
>> > --
>> > Martin Odersky
>> > EPFL
>>
>>
>>
>> --
>> Martin Odersky
>> EPFL
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "scala-debate" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [hidden email].
>> For more options, visit https://groups.google.com/d/optout.
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "scala-debate" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [hidden email].
> For more options, visit https://groups.google.com/d/optout.



--
Martin Odersky
EPFL

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: Scope of redesigned collections

Aleksandar Prokopec


On Mon, Oct 5, 2015 at 11:55 PM, Haoyi Li <[hidden email]> wrote:
But, I'm sure that some people have valid use cases where they find these methods useful,

Let's skip the hypotheticals and ask: who are these people and what are their use cases? In theory everything is useful to someone somewhere maybe. In the end there's always an empirical judgement of how many people you think will want this functionality. "Non-zero, maybe" isn't satisfactory =P 

Could we kick of community builds with instrumentation via Abide to see what the actual usage numbers on the various APIs are? That would help give us something concrete to talk about, and perhaps suggest alternatives for the use cases we do find.


Agreed - it would be great to have numbers on this.
 
as a side note, I find it funny how we go at great lengths to prevent our users from shooting themselves in the foot

While Scala's immutable collections are nice, Scala's mutable collections API is one of the most easily-foot-shooting APIs I know of. 100s of methods you don't want, barely any methods you want, missing several methods you *do* want. 

Python, Java, and many other languages provide more ergonomic mutable collections. Let's not be too proud of ourselves =D 


Why do you say they are more ergonomic? In which ways?
Python, for example, also has mutable lists and dictionaries, and for-comprehensions that copy those collections.

Aside from numbers, I guess it would be helpful to build up a knowledge-base of problematic examples.
 
Should a conversion from a mutable collection to a sequence also be enabled through an explicitly imported implicit conversion?

What's wrong with just an explicit conversion?

def freeze(input: mutable.Seq[T]): immutable.Seq[T]

Not everything needs to be an implicit.


Could work. Although - it adds some syntactic overhead. And also some performance overhead if it's a copying freeze (otherwise, ownership of the mutable collection would have to relinquished somehow).
 

On Mon, Oct 5, 2015 at 2:47 PM, Aleksandar Prokopec <[hidden email]> wrote:


On Monday, 5 October 2015 23:26:13 UTC+2, Haoyi Li wrote:
To play the devils advocate with what Martin said: we don't need to copy a mutable collection to make it immutable, we could just freeze it, i.e. disallow further updates.

Re-sending what I sent to martin to the list, I must've forgotten to reply-all

Would an explicit wrapper work in that case? e.g. 

def freeze(myArray: Array[T]): IndexedSeq[T]

This "want to turn mutable collection into immutable one with zero copy" use case happens, but I'd guess happens a lot less than the use case of "want to use mutable collection as mutable collection and not accidentally call expensive immutable methods", so we can optimize the common case to make it impossible to accidentally call immutable methods, while providing a short one-function helper to make the other use case convenient too


Agreed about the two use-cases.
 
> Another possibility would be to go the Rust way, and use ownership types to ensure that iteration and mutation don't happen at the same time.

I'd love if this was the case but I doubt it will happen in next 2 years =P

 > I'm still not sure what is the root cause why we would want to remove operations on mutable collections (and from the sentence "Buffers append/remove is what you generally care about, but  they're lost in the huge sea of immutable methods you basically never  want", it's not clear to me which ones you want to see removed).
Could you clarify?

Basically everything that returns a new collection should not be part of the mutable API by default.

I like the part here where you say "by default". :)
If we consider bulk operations (i.e. map, filter, flatMap, sorted, ...) and other copying operations (:+, +:, and similar smiley methods) dangerous because they incur serious performance penalties, we could disallow them on mutable collections.
But, I'm sure that some people have valid use cases where they find these methods useful, and they should be able to enable them via import of a decorator or a typeclass.

This design seems to go hand-in-hand with what implicit decorators currently in the strawman proposal.

(as a side note, I find it funny how we go at great lengths to prevent our users from shooting themselves in the foot)

The whole point of using mutable collections is to mutate them in place and not re-allocate all the time. If I wanted to re-allocate all over the place I'd just use an immutable collection. The number of times I've actually wanted to call map, filter, etc. on a mutable collection in the last 3 years has been zero.


Still, people sometimes return e.g. ArrayBuffers from interface methods that declare that they return a Seq - and ArrayBuffers are one of the more efficient sequence implementations.
And once you have an immutable view (Seq) of a mutable collection (ArrayBuffer), it's easy to do that mistake.
Note that the current strawman says ArrayBuffer extends Seq.
Should a conversion from a mutable collection to a sequence also be enabled through an explicitly imported implicit conversion?
 
As an exercise, compare:

Compare mutable.ArrayBuffer:


with java.util.ArrayList:


Get a stopwatch and answer the following questions:

- How do I remove every element satisfying some predicate from mutable.ArrayBuffer, in place?
- How do I remove every element satisfying some predicate from java.util.ArrayList, in place?

- How do sort the mutable.ArrayBuffer by some predicate, in place?
 -How do sort the java.util.ArrayList by some predicate, in place?

Time how long it takes you to answer each question (how to do it, or if it cannot be done) by browsing the list of methods. 

For me it takes >10x as long to answer each one for mutable.ArrayBuffer, due to the huge pile of "useless" copying methods.

SPOILER BELOW






































It turns out you can't perform either operation on a mutable.ArrayBuffer, which is pretty surprising given these are very common operations to want to do on any mutable collection. 


This could be addressed with another family of decorator methods that are specific to mutable collections.
Is that what you would propose?
 

On Mon, Oct 5, 2015 at 2:09 PM, Aleksandar Prokopec <[hidden email]> wrote:


On Monday, 5 October 2015 22:54:03 UTC+2, Haoyi Li wrote:
But, what about ResizableArray. It looks like an array, so why deny it
the same operations?

Good question. Do people use it? I count 220 occurrences on github in Scala files vs 370895 occurrences for "Seq" vs 48,532 occurrences for "ArrayBuffer", so I'd argue the added weirdness in that case won't affect too many people

> There's the other valid use case where you build up a mutable
collection and once it is fully constructed want to use it as if it
was immutable. Copying to make it immutable is not permitted for
performance reasons. How would be support that?



To play the devils advocate with what Martin said: we don't need to copy a mutable collection to make it immutable, we could just freeze it, i.e. disallow further updates.
Another possibility would be to go the Rust way, and use ownership types to ensure that iteration and mutation don't happen at the same time.
 
@Haoyi LI:
I'm still not sure what is the root cause why we would want to remove operations on mutable collections (and from the sentence "Buffers append/remove is what you generally care about, but  they're lost in the huge sea of immutable methods you basically never  want", it's not clear to me which ones you want to see removed).
Could you clarify?
 
Isn't that what the XXXBuilder classes like VectorBuilder are for? If you want a builder you use a builder collection, not a mutable collection. Unless we're getting rid of builders?


There is more to mutable collections than just the builder pattern in which elements are monotonically added.
Maps and sets (although not in the strawman proposal) support remove and conditional removes.


On Mon, Oct 5, 2015 at 1:44 PM, martin odersky <[hidden email]> wrote:
But, what about ResizableArray. It looks like an array, so why deny it
the same operations?

There's the other valid use case where you build up a mutable
collection and once it is fully constructed want to use it as if it
was immutable. Copying to make it immutable is not permitted for
performance reasons. How would be support that?

 - Martin



On Mon, Oct 5, 2015 at 10:30 PM, Haoyi Li <[hidden email]> wrote:
>> So if we do not want to go there, where to draw the line?
>
> I say we draw the line between Arrays and everything else. They're already
> weird special things (reified generics, rubbish toString, rubbish equality,
> ...), so IMHO it's perfectly fine to have them be weird-special mutable
> objects with immutable operations. They're not even really collections in
> the first place!
>
> After that, immutable.* operations on mutable.T forall T in {Seq, Set, Map,
> Buffer, Queue, PriorityQueue, ...}  should be all killed.
>
> On Mon, Oct 5, 2015 at 1:23 PM, martin odersky <[hidden email]>
> wrote:
>>
>> Here is a first response to
>>
>> >> We gloss over the mutabile/immutable distinction
>>
>> > We probably want mutable collections to not have the same API as the
>> > immutable ones. One large problem with the current mutable collections is
>> > that by inheriting all the immutable methods, it's
>> neigh impossible to figure out where the "useful" mutable methods are.
>> e.g. Buffers append/remove is what you generally care about, but
>> they're lost in the huge sea of immutable methods you basically never
>> want. The same applies to mutable.Set, mutable.Queue, etc.
>>
>> > By that logic, ArrayBuffer probably shouldn't be a collection, and
>> > perhaps making it convertible to one through .view or .iterator is enough?
>>
>> If we take this idea to its logical conclusion, we should abandon
>> `map` and friends on arrays because they are also mutable. This would
>> render obsolete virtually every Scala book in existence. Besides, it
>> would be counter-productive. Map and friends are perfectly good
>> methods on arrays.
>> So if we do not want to go there, where to draw the line?
>>
>>  - Martin
>>
>>
>>
>>
>> On Mon, Oct 5, 2015 at 10:18 PM, martin odersky <[hidden email]>
>> wrote:
>> > I have recently opened an issue inviting strawman designs for new
>> > collection architectures:
>> >
>> > https://github.com/lampepfl/dotty/issues/818
>> >
>> > The issue lists some requirements for a design to be acceptable. Since
>> > then there has been a discussion about several points that are all
>> > outside the requirements outlined in #818. I think it's worthwhile to
>> > discuss these, but do not think #818 is the proper place to do it.
>> > That's why I would like to continue the discussion here.
>> >
>> >  - Martin
>> >
>> >
>> >
>> > --
>> > Martin Odersky
>> > EPFL
>>
>>
>>
>> --
>> Martin Odersky
>> EPFL
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "scala-debate" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [hidden email].
>> For more options, visit https://groups.google.com/d/optout.
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "scala-debate" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [hidden email].
> For more options, visit https://groups.google.com/d/optout.



--
Martin Odersky
EPFL

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.


--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: Scope of redesigned collections

Simon Ochsenreither-3
In reply to this post by Martin Odersky
Given the scope, I would strongly suggest abandoning the collection redesign. It doesn't address the pain points we should worry about, and the restrictions mean it is at best yet-another temporary measure.

This feels like a great recipe for a Python 2/Python3 disaster: Too many small, subtle changes which will bite people with not enough associated benefit to encourage strong adoption.

If we really want to do something, I would probably prefer working on Josh's views, despite not being enamored about it: At least it's made clear that they are provisional band-aids.

But the scope just means that we will have the same discussion a few years down the road, and I prefer breaking people's code once, not twice.

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: Scope of redesigned collections

Martin Odersky
On Tue, Oct 6, 2015 at 12:16 AM, Simon Ochsenreither
<[hidden email]> wrote:

> Given the scope, I would strongly suggest abandoning the collection
> redesign. It doesn't address the pain points we should worry about, and the
> restrictions mean it is at best yet-another temporary measure.
>
> This feels like a great recipe for a Python 2/Python3 disaster: Too many
> small, subtle changes which will bite people with not enough associated
> benefit to encourage strong adoption.
>
> If we really want to do something, I would probably prefer working on Josh's
> views, despite not being enamored about it: At least it's made clear that
> they are provisional band-aids.
>
> But the scope just means that we will have the same discussion a few years
> down the road, and I prefer breaking people's code once, not twice.

What specifically do you object to?

 - Martin

>
> --
> You received this message because you are subscribed to the Google Groups
> "scala-debate" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [hidden email].
> For more options, visit https://groups.google.com/d/optout.



--
Martin Odersky
EPFL

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: Scope of redesigned collections

Haoyi Li
In reply to this post by Simon Ochsenreither-3
Why do you say they are more ergonomic? In which ways?

Here's what you get when you want the methods on a python list:


Inline image 1

A small number of operations, basically all of which are useful to everyone who wants a mutable list. And all can be understood by anyone who has ever programmed before, without reading any docs! 

Here's what you get when you ask for the methods on a scala mutable buffer:

Inline image 2

How many of these are methods that someone can call on a mutable buffer without horrendous performance implications? Less than half. In fact, methods which look very similar (++ and ++=) have totally different semantics and performance implications. 

As an exercise, can you give me which methods on that list will result in O(n) copying of the whole collection, which is basically what you never want?

Here are a few that you don't want to ever call if you want mutable-seq performance

++
++:
+:
-
--
collect
diff
distinct
drop
dropRight
dropWhile
filter
filterNot
flatMap
groupBy
grouped


Here are a few that you *do* want

++=
+=
collectFirst
append
appendAll
fold{Left, Right}
forall
foreach
count
clear()

Here are a few which are just wtf:

<<(cmd: Message[Int])
+(other: String)
addString(b: StringBuilder)





On Mon, Oct 5, 2015 at 3:16 PM, Simon Ochsenreither <[hidden email]> wrote:
Given the scope, I would strongly suggest abandoning the collection redesign. It doesn't address the pain points we should worry about, and the restrictions mean it is at best yet-another temporary measure.

This feels like a great recipe for a Python 2/Python3 disaster: Too many small, subtle changes which will bite people with not enough associated benefit to encourage strong adoption.

If we really want to do something, I would probably prefer working on Josh's views, despite not being enamored about it: At least it's made clear that they are provisional band-aids.

But the scope just means that we will have the same discussion a few years down the road, and I prefer breaking people's code once, not twice.

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: Scope of redesigned collections

Aleksandar Prokopec


On Tuesday, 6 October 2015 00:34:14 UTC+2, Haoyi Li wrote:
Why do you say they are more ergonomic? In which ways?

Here's what you get when you want the methods on a python list:


Inline image 1

A small number of operations, basically all of which are useful to everyone who wants a mutable list. And all can be understood by anyone who has ever programmed before, without reading any docs! 


That's a bit unfair comparison, because it does not include Python's built-in functions, many of which are list-related:

https://docs.python.org/2/library/functions.html

That adds more functions to the table.
Plus, there's Python's itertools, which has more list-related methods:

https://docs.python.org/2/library/itertools.html
 
Here's what you get when you ask for the methods on a scala mutable buffer:

Inline image 2

How many of these are methods that someone can call on a mutable buffer without horrendous performance implications? Less than half. In fact, methods which look very similar (++ and ++=) have totally different semantics and performance implications. 


A couple of remarks here:

1. There are many use-cases where one deals with small collections, comprised of several elements, where performance implications are less important.
2. There are use-cases where copying is less expensive than the actual useful work performed on the collection.
3. Non-strict map, flatMap, filter, ... could reduce or eliminate performance overhead due to copying.
4. The optimizer could fuse many of these operations, or eliminate copying, in particular those where a copying bulk operations is followed by a fold.
 
As an exercise, can you give me which methods on that list will result in O(n) copying of the whole collection, which is basically what you never want?


I'm not excluding that you might be right about "basically what you never want", but as you pointed out yourself - we would need numbers to make definite claims about this.
 
Here are a few that you don't want to ever call if you want mutable-seq performance

++
++:
+:
-
--
collect
diff
distinct
drop
dropRight
dropWhile
filter
filterNot
flatMap
groupBy
grouped


Here are a few that you *do* want

++=
+=
collectFirst
append
appendAll
fold{Left, Right}
forall
foreach
count
clear()

Here are a few which are just wtf:

<<(cmd: Message[Int])
+(other: String)
addString(b: StringBuilder)





On Mon, Oct 5, 2015 at 3:16 PM, Simon Ochsenreither <<a href="javascript:" target="_blank" gdf-obfuscated-mailto="O2NN3z9CAQAJ" rel="nofollow" onmousedown="this.href=&#39;javascript:&#39;;return true;" onclick="this.href=&#39;javascript:&#39;;return true;">simon.och...@...> wrote:
Given the scope, I would strongly suggest abandoning the collection redesign. It doesn't address the pain points we should worry about, and the restrictions mean it is at best yet-another temporary measure.

This feels like a great recipe for a Python 2/Python3 disaster: Too many small, subtle changes which will bite people with not enough associated benefit to encourage strong adoption.

If we really want to do something, I would probably prefer working on Josh's views, despite not being enamored about it: At least it's made clear that they are provisional band-aids.

But the scope just means that we will have the same discussion a few years down the road, and I prefer breaking people's code once, not twice.

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to <a href="javascript:" target="_blank" gdf-obfuscated-mailto="O2NN3z9CAQAJ" rel="nofollow" onmousedown="this.href=&#39;javascript:&#39;;return true;" onclick="this.href=&#39;javascript:&#39;;return true;">scala-debate...@googlegroups.com.
For more options, visit <a href="https://groups.google.com/d/optout" target="_blank" rel="nofollow" onmousedown="this.href=&#39;https://groups.google.com/d/optout&#39;;return true;" onclick="this.href=&#39;https://groups.google.com/d/optout&#39;;return true;">https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: Scope of redesigned collections

Simon Ochsenreither-3
In reply to this post by Martin Odersky
The approach doesn't
  • address the issue that everyone rolls their own "collection-like" API, because the API makes unnecessary prescriptions about how things need to work
  • support efficient execution by default, leading to a slow-but-convenient API (which would have already fallen into disuse if views ever worked reliably), which repeats all the mistakes we already made
  • reduce the duplication within the API by keeping views as an inconvenient, more verbose API (which adds roughly as much boilerplate as Java Streams, so why not just give up and use Java Streams instead?)
  • understand that assigning labels about forcing/non-forcing is not meaningful for most use-cases and produces unnecessary complexity
  • separate between the composition of operations and its input, meaning we are ignoring all the lessons learned in Slick
  • offer any thought regarding macro-based implementations, meaning we are discarding the opportunity to improve on LINQ's crude approach

I'm not sure if it makes sense trying to salvage the request for strawmen, but I think

  • removing requirement (6), demanding instead that there is only a single API that works, instead of having an inefficient, convenient and an efficient, but inconvenient one
  • clarifying that (1) doesn't mean that every operation has to start executing implicitly
  • adding Table[T] (as a representation of a database table as an input source) and some RDD-like type (as a representation of a distributed data source)
could move things into a better direction.

Otherwise it feels we are just swapping some well-known disadvantages against other well-known disadvantages.

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: Scope of redesigned collections

Simon Ochsenreither-3
In reply to this post by Haoyi Li
Hey Haoyi,

wouldn't this issue just disappear in a design where those general operations returned some kind of View[T]/Stream[T]/..., while the mutable, in-place operations returned the data-structure itself?
I didn't consider your requirements to be that important (it's certainly an annoying issue), but I think the approach I'm imagining would deal very well with your pain points.

Bye,

Simon

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: Scope of redesigned collections

Matthew de Detrich
In reply to this post by Martin Odersky
Most pressing issues for me personally

1. Immutable/Mutable are not distinct enough. Like @lihaoyi said, they share too many common methods which you often do not want to use (i.e. you generally don't want to use an `append` on a immutable List). I am not sure how much this will break requirement #1 though. I guess my main personal gripe is its often confusing which methods you should use on mutable/immutable collections since they seem to be shared with eachother all over the place
2. Stuff like `.toSeq` and `toList` seems completely pointless when you have stuff like `to[List]`  and `to[Seq]`. The stdlib library in general however seems to have this all over the place. Would it make sense to provide a idiomatic (at least as far as stdlib is concerned) method for converting between different types, which is `.to[T]`?
3. Regarding `Arrays`, they are still incredibly important (so I question the method of not including them in our strawman) however, if immutable/mutable collections are split enough, then incorporating Arrays into the mutable collections shouldn't be too much of a problem

On Tuesday, October 6, 2015 at 7:18:38 AM UTC+11, Martin wrote:
I have recently opened an issue inviting strawman designs for new
collection architectures:

<a href="https://github.com/lampepfl/dotty/issues/818" target="_blank" rel="nofollow" onmousedown="this.href=&#39;https://www.google.com/url?q\75https%3A%2F%2Fgithub.com%2Flampepfl%2Fdotty%2Fissues%2F818\46sa\75D\46sntz\0751\46usg\75AFQjCNGOFHWkF3kPkL2sbfn7D1oRdfo2jg&#39;;return true;" onclick="this.href=&#39;https://www.google.com/url?q\75https%3A%2F%2Fgithub.com%2Flampepfl%2Fdotty%2Fissues%2F818\46sa\75D\46sntz\0751\46usg\75AFQjCNGOFHWkF3kPkL2sbfn7D1oRdfo2jg&#39;;return true;">https://github.com/lampepfl/dotty/issues/818

The issue lists some requirements for a design to be acceptable. Since
then there has been a discussion about several points that are all
outside the requirements outlined in #818. I think it's worthwhile to
discuss these, but do not think #818 is the proper place to do it.
That's why I would like to continue the discussion here.

 - Martin



--
Martin Odersky
EPFL

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: Scope of redesigned collections

Rex Kerr-2
In reply to this post by Martin Odersky
I think we shouldn't do a relatively minor redesign now.  Instead, I think we should basically stay the course, which is to continue to clean up especially problematic cases that trip people up.  We've deprecated many of the weirdest and least-used collections, and can keep doing so to make the library cleaner and more consistent.

The reason I suggest to stay the course is that I don't think we can do a major redesign at this point.  Although there have been some advances in structuring collections operations, for the most part we don't yet have the tools to do dramatically better without also making some things dramatically worse.  There's no game-changer being proposed even in the most radical redesigns.

What _would_ make collections not just a bit better but dramatically better?  After all, as an end user they work pretty well as long as you don't need to extend them.

* First-class views with self-terminating scope (can be computed with ownership, but there are other ways):
    // Why is this even possible?
    val xs = whatever.iterator
    xs.take(xs.length / 2)    // Always empty?!

* Automatic AST fusion
  // You write this
  myVector.map(x => x*x).map(math.min(6,_)).sum
  // Scala generates this
  { var x = 0
    var vi = myVector.privateVectorIterator
    while (vi.hasNext) { val y = vi.next; x += math.min(6, y*y) }
    x
  }

 * SameType, a limited MyType without requiring exact types that captures the simple case of the builder pattern

  class Foo[A] { self: SelfType[A] =>
    def empty[B]: SelfType[B]
    def +(a: A): SelfType[A]
    def foreach(f: A => Unit): Unit
    def map[B](f: A => B): SelfType[B] = {
      var e = empty[B]
      foreach(x => e = e + f(x))
      e
    }
  }

  // Look, we extended Bar, and we can map!  No FooLike, no CBF!
  class Bar[A] private (private val only: Option[A]) extends Foo[A] {
    this() = new Bar[A](None)
    def empty[B] = new Bar[B](None)
    def +(a: A) = new Bar[A](Some(a))
    def foreach(f: A => Unit) { only.foreach(f) }
  }

For complex cases where you can't sensibly inherit from the parent _you don't inherit from the parent_.  When variance sneaks in to bite you, you simply punt and require a more complex pattern like CBF.

 * Specialization / miniboxing.  (If it's even necessary any more after AST fusion.)

 * Typeclass-invoked method groups to alert users to use cases.  (Probably needs implicit type parameters to work.)

val xs = mutable.Seq(1,2,3)
xs + 2   // Fails
xs.rebuild + 2  // rebuild "unlocks" all the immutable rebuilding methods

And I'm not entirely sure what extra safety would be possible without heroics with more advanced type lambdas.

All together this wouldn't be a little redesign, it'd be a _huge_ one, collections 2.0.

The idea would be to keep the old collections in maintenance mode approximately forever, gradually simplifying away the least used parts, until eventually everyone's just using the new one because it's way more awesome.

Anyway, I'm not sure if others like my particular wish list for collections, but I think for something as central as collections, messing with it a bit is not a winning proposition.  Any change will cause pain unless it's very minor, but without big and clear advantages, the pain will be hard to bear.

  --Rex




On Mon, Oct 5, 2015 at 1:18 PM, martin odersky <[hidden email]> wrote:
I have recently opened an issue inviting strawman designs for new
collection architectures:

https://github.com/lampepfl/dotty/issues/818

The issue lists some requirements for a design to be acceptable. Since
then there has been a discussion about several points that are all
outside the requirements outlined in #818. I think it's worthwhile to
discuss these, but do not think #818 is the proper place to do it.
That's why I would like to continue the discussion here.

 - Martin



--
Martin Odersky
EPFL

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: Scope of redesigned collections

Simon Ochsenreither-3
+1 to everything Rex said.

(Although I believe the things I suggested could count as game-changers, it's clear that the technology to pull it off is not there yet.)

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: Scope of redesigned collections

Martin Odersky
In reply to this post by Aleksandar Prokopec


On Tue, Oct 6, 2015 at 1:29 AM, Aleksandar Prokopec <[hidden email]> wrote:


On Tuesday, 6 October 2015 00:34:14 UTC+2, Haoyi Li wrote:
Why do you say they are more ergonomic? In which ways?

Here's what you get when you want the methods on a python list:


Inline image 1

A small number of operations, basically all of which are useful to everyone who wants a mutable list. And all can be understood by anyone who has ever programmed before, without reading any docs! 


That's a bit unfair comparison, because it does not include Python's built-in functions, many of which are list-related:

https://docs.python.org/2/library/functions.html

That adds more functions to the table.
Plus, there's Python's itertools, which has more list-related methods:

https://docs.python.org/2/library/itertools.html
 
Here's what you get when you ask for the methods on a scala mutable buffer:

Inline image 2

How many of these are methods that someone can call on a mutable buffer without horrendous performance implications? Less than half. In fact, methods which look very similar (++ and ++=) have totally different semantics and performance implications. 

I'd like to challenge that. Buffer has 130+ methods defined on it. I don't think there are 65 methods among them that have horrendous performance implications.

Let's look at it in more detail, and things get fuzzy:

 - we have already established that arrays should get the full complement of immutable methods.
 - ResizableArray is like array, so why draw the boundary there?
 - ResizableArray is not used a lot because most people use ArrayBuffer to get a resizable array. So maybe ArrayBuffer needs to support these immutable operations as well?

In more detail:

 - Should ArrayBuffer support map, flatMap, filter? I believe the answer is yes, because otherwise we render for comprehensions over ArrayBuffers impossible.
 - If it supports filter, then why not also support partition and scan? They are just variants of filter, after all. 
 - What about ++? Why forbid concatenating two resizable arrays into one?
 - What about zip? It's clearly useful and it clearly constructs a new collection.

And so on. So what operations do we want to throw out? Probably the most "dubious" one are +, +:, :+, -, which copy the collection while adding or removing a single element. There are few situations where these make sense. And maybe we should indeed phase them out for mutable collections. I think, arguably, doing a + instead of += on a mutable collection counts as "exotic use" and could be deprecated. But note that these operations are even more often misused for immutable collections. xs :+ x on lists is one of the most common rookie mistakes in Scala. 

Cheers

 - Martin


 

A couple of remarks here:

1. There are many use-cases where one deals with small collections, comprised of several elements, where performance implications are less important.
2. There are use-cases where copying is less expensive than the actual useful work performed on the collection.
3. Non-strict map, flatMap, filter, ... could reduce or eliminate performance overhead due to copying.
4. The optimizer could fuse many of these operations, or eliminate copying, in particular those where a copying bulk operations is followed by a fold.
 
As an exercise, can you give me which methods on that list will result in O(n) copying of the whole collection, which is basically what you never want?


I'm not excluding that you might be right about "basically what you never want", but as you pointed out yourself - we would need numbers to make definite claims about this.
 
Here are a few that you don't want to ever call if you want mutable-seq performance

++
++:
+:
-
--
collect
diff
distinct
drop
dropRight
dropWhile
filter
filterNot
flatMap
groupBy
grouped


Here are a few that you *do* want

++=
+=
collectFirst
append
appendAll
fold{Left, Right}
forall
foreach
count
clear()

Here are a few which are just wtf:

<<(cmd: Message[Int])
+(other: String)
addString(b: StringBuilder)





On Mon, Oct 5, 2015 at 3:16 PM, Simon Ochsenreither <[hidden email]> wrote:
Given the scope, I would strongly suggest abandoning the collection redesign. It doesn't address the pain points we should worry about, and the restrictions mean it is at best yet-another temporary measure.

This feels like a great recipe for a Python 2/Python3 disaster: Too many small, subtle changes which will bite people with not enough associated benefit to encourage strong adoption.

If we really want to do something, I would probably prefer working on Josh's views, despite not being enamored about it: At least it's made clear that they are provisional band-aids.

But the scope just means that we will have the same discussion a few years down the road, and I prefer breaking people's code once, not twice.

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.



--
Martin Odersky
EPFL

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
12345