tuple assignment optimization

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

tuple assignment optimization

Paul Phillips-3
Slightly long but reasonably important.  Ten second summary: I have
modified my compiler so that variable declarations in tuple style like

  val (x,y) = ("abc", 5)

generate bytecode like this:

  val x = "abc"
  val y = 5

instead of like this:

  val tmp = new Tuple2("abc", 5)
  val x = tmp._1
  val y = tmp._2

This has at least two benefits:

  1) There is no bonus tuple stored in every object
  2) Type errors become compile time errors instead of runtime errors.

To illustrate 2) consider this:

  val (x1, x2): (String, Int) = ("val", "def")

At present this has the double whammy of an annoying "unchecked" warning
and a nice ClassCastException at runtime.  But now:

a1.scala:8: error: type mismatch;
 found   : java.lang.String("def")
 required: Int
  val (x1, x2): (String, Int) = ("val", "def")
                                        ^
QUESTIONS: Does this have any hidden gotchas or can this be treated as a
pure optimization? Can more variations be optimized similarly or is this
the end of the line?

More details:

The general issues, well covered on the scala lists, are here:

  https://lampsvn.epfl.ch/trac/scala/ticket/140
  https://lampsvn.epfl.ch/trac/scala/ticket/1913

Given the following code:

  class t1 { val (x1, x2) = ("str", 5) }

As we know too well, currently the classfile includes:

  private final int x2;
  private final java.lang.String x1;
  private final scala.Tuple2 x$1;

...and such unnecessary fun as

   41: getfield #33; //Field x$1:Lscala/Tuple2;
   44: invokevirtual #44; //Method scala/Tuple2._2:()Ljava/lang/Object;
   47: invokestatic #48; //Method scala/runtime/BoxesRunTime.unboxToInt:(Ljava/lang/Object;)I
   50: putfield #50; //Field x2:I

In my local tree the given class now compiles to:

 private final int x2;
 private final java.lang.String x1;

And here is the complete constructor:

   0: aload_0
   1: invokespecial #14; //Method java/lang/Object."<init>":()V
   4: aload_0
   5: ldc #16; //String str
   7: putfield #20; //Field x1:Ljava/lang/String;
   10: aload_0
   11: iconst_5
   12: putfield #22; //Field x2:I
   15: return

This only happens if the right hand side of the assignment is a tuple
literal.  Nothing changes if it is a block or method call which evalutes
to a tuple, and I'm not sure much can be done in that case with this
general strategy, because the tuple will have to be stored somewhere for
at least long enough to extract the values.

--
Paul Phillips      | Every election is a sort of advance auction sale
Stickler           | of stolen goods.
Empiricist         |     -- H. L. Mencken
pp: i haul pills   |----------* http://www.improving.org/paulp/ *----------
Reply | Threaded
Open this post in threaded view
|

Re: tuple assignment optimization

Jan Lohre
Why would one want to write

val (x,y) = ("abc", 5)

instead of

 val x = "abc"
 val y = 5

directly?

Kind regards,
Jan

2009/5/14 Paul Phillips <[hidden email]>
Slightly long but reasonably important.  Ten second summary: I have
modified my compiler so that variable declarations in tuple style like

 val (x,y) = ("abc", 5)

generate bytecode like this:

 val x = "abc"
 val y = 5

instead of like this:

 val tmp = new Tuple2("abc", 5)
 val x = tmp._1
 val y = tmp._2

This has at least two benefits:

 1) There is no bonus tuple stored in every object
 2) Type errors become compile time errors instead of runtime errors.

To illustrate 2) consider this:

 val (x1, x2): (String, Int) = ("val", "def")

At present this has the double whammy of an annoying "unchecked" warning
and a nice ClassCastException at runtime.  But now:

a1.scala:8: error: type mismatch;
 found   : java.lang.String("def")
 required: Int
 val (x1, x2): (String, Int) = ("val", "def")
                                       ^
QUESTIONS: Does this have any hidden gotchas or can this be treated as a
pure optimization? Can more variations be optimized similarly or is this
the end of the line?

More details:

The general issues, well covered on the scala lists, are here:

 https://lampsvn.epfl.ch/trac/scala/ticket/140
 https://lampsvn.epfl.ch/trac/scala/ticket/1913

Given the following code:

 class t1 { val (x1, x2) = ("str", 5) }

As we know too well, currently the classfile includes:

 private final int x2;
 private final java.lang.String x1;
 private final scala.Tuple2 x$1;

...and such unnecessary fun as

  41:  getfield        #33; //Field x$1:Lscala/Tuple2;
  44:  invokevirtual   #44; //Method scala/Tuple2._2:()Ljava/lang/Object;
  47:  invokestatic    #48; //Method scala/runtime/BoxesRunTime.unboxToInt:(Ljava/lang/Object;)I
  50:  putfield        #50; //Field x2:I

In my local tree the given class now compiles to:

 private final int x2;
 private final java.lang.String x1;

And here is the complete constructor:

  0:   aload_0
  1:   invokespecial   #14; //Method java/lang/Object."<init>":()V
  4:   aload_0
  5:   ldc     #16; //String str
  7:   putfield        #20; //Field x1:Ljava/lang/String;
  10:  aload_0
  11:  iconst_5
  12:  putfield        #22; //Field x2:I
  15:  return

This only happens if the right hand side of the assignment is a tuple
literal.  Nothing changes if it is a block or method call which evalutes
to a tuple, and I'm not sure much can be done in that case with this
general strategy, because the tuple will have to be stored somewhere for
at least long enough to extract the values.

--
Paul Phillips      | Every election is a sort of advance auction sale
Stickler           | of stolen goods.
Empiricist         |     -- H. L. Mencken
pp: i haul pills   |----------* http://www.improving.org/paulp/ *----------

Reply | Threaded
Open this post in threaded view
|

Re: tuple assignment optimization

Alex Boisvert-3
One case I frequently encounter is to do something like,

val (head, tail) = list splitAt index

alex


On Thu, May 14, 2009 at 11:43 AM, Jan Lohre <[hidden email]> wrote:
Why would one want to write


val (x,y) = ("abc", 5)

instead of


 val x = "abc"
 val y = 5

directly?

Kind regards,
Jan

2009/5/14 Paul Phillips <[hidden email]>

Slightly long but reasonably important.  Ten second summary: I have
modified my compiler so that variable declarations in tuple style like

 val (x,y) = ("abc", 5)

generate bytecode like this:

 val x = "abc"
 val y = 5

instead of like this:

 val tmp = new Tuple2("abc", 5)
 val x = tmp._1
 val y = tmp._2

This has at least two benefits:

 1) There is no bonus tuple stored in every object
 2) Type errors become compile time errors instead of runtime errors.

To illustrate 2) consider this:

 val (x1, x2): (String, Int) = ("val", "def")

At present this has the double whammy of an annoying "unchecked" warning
and a nice ClassCastException at runtime.  But now:

a1.scala:8: error: type mismatch;
 found   : java.lang.String("def")
 required: Int
 val (x1, x2): (String, Int) = ("val", "def")
                                       ^
QUESTIONS: Does this have any hidden gotchas or can this be treated as a
pure optimization? Can more variations be optimized similarly or is this
the end of the line?

More details:

The general issues, well covered on the scala lists, are here:

 https://lampsvn.epfl.ch/trac/scala/ticket/140
 https://lampsvn.epfl.ch/trac/scala/ticket/1913

Given the following code:

 class t1 { val (x1, x2) = ("str", 5) }

As we know too well, currently the classfile includes:

 private final int x2;
 private final java.lang.String x1;
 private final scala.Tuple2 x$1;

...and such unnecessary fun as

  41:  getfield        #33; //Field x$1:Lscala/Tuple2;
  44:  invokevirtual   #44; //Method scala/Tuple2._2:()Ljava/lang/Object;
  47:  invokestatic    #48; //Method scala/runtime/BoxesRunTime.unboxToInt:(Ljava/lang/Object;)I
  50:  putfield        #50; //Field x2:I

In my local tree the given class now compiles to:

 private final int x2;
 private final java.lang.String x1;

And here is the complete constructor:

  0:   aload_0
  1:   invokespecial   #14; //Method java/lang/Object."<init>":()V
  4:   aload_0
  5:   ldc     #16; //String str
  7:   putfield        #20; //Field x1:Ljava/lang/String;
  10:  aload_0
  11:  iconst_5
  12:  putfield        #22; //Field x2:I
  15:  return

This only happens if the right hand side of the assignment is a tuple
literal.  Nothing changes if it is a block or method call which evalutes
to a tuple, and I'm not sure much can be done in that case with this
general strategy, because the tuple will have to be stored somewhere for
at least long enough to extract the values.

--
Paul Phillips      | Every election is a sort of advance auction sale
Stickler           | of stolen goods.
Empiricist         |     -- H. L. Mencken
pp: i haul pills   |----------* http://www.improving.org/paulp/ *----------


Reply | Threaded
Open this post in threaded view
|

Re: tuple assignment optimization

Kris Nuttycombe-3
In reply to this post by Jan Lohre
I've encountered the need to do this within the primary constructor
where more than one member val of the class was simultaneously
computed from the constructor arguments. I'd really like to see this
change make it into trunk if there are not significant gotchas.

Kris

On Thu, May 14, 2009 at 12:43 PM, Jan Lohre <[hidden email]> wrote:

> Why would one want to write
>
> val (x,y) = ("abc", 5)
>
> instead of
>
>  val x = "abc"
>  val y = 5
>
> directly?
>
> Kind regards,
> Jan
>
> 2009/5/14 Paul Phillips <[hidden email]>
>>
>> Slightly long but reasonably important.  Ten second summary: I have
>> modified my compiler so that variable declarations in tuple style like
>>
>>  val (x,y) = ("abc", 5)
>>
>> generate bytecode like this:
>>
>>  val x = "abc"
>>  val y = 5
>>
>> instead of like this:
>>
>>  val tmp = new Tuple2("abc", 5)
>>  val x = tmp._1
>>  val y = tmp._2
>>
>> This has at least two benefits:
>>
>>  1) There is no bonus tuple stored in every object
>>  2) Type errors become compile time errors instead of runtime errors.
>>
>> To illustrate 2) consider this:
>>
>>  val (x1, x2): (String, Int) = ("val", "def")
>>
>> At present this has the double whammy of an annoying "unchecked" warning
>> and a nice ClassCastException at runtime.  But now:
>>
>> a1.scala:8: error: type mismatch;
>>  found   : java.lang.String("def")
>>  required: Int
>>  val (x1, x2): (String, Int) = ("val", "def")
>>                                        ^
>> QUESTIONS: Does this have any hidden gotchas or can this be treated as a
>> pure optimization? Can more variations be optimized similarly or is this
>> the end of the line?
>>
>> More details:
>>
>> The general issues, well covered on the scala lists, are here:
>>
>>  https://lampsvn.epfl.ch/trac/scala/ticket/140
>>  https://lampsvn.epfl.ch/trac/scala/ticket/1913
>>
>> Given the following code:
>>
>>  class t1 { val (x1, x2) = ("str", 5) }
>>
>> As we know too well, currently the classfile includes:
>>
>>  private final int x2;
>>  private final java.lang.String x1;
>>  private final scala.Tuple2 x$1;
>>
>> ...and such unnecessary fun as
>>
>>   41:  getfield        #33; //Field x$1:Lscala/Tuple2;
>>   44:  invokevirtual   #44; //Method scala/Tuple2._2:()Ljava/lang/Object;
>>   47:  invokestatic    #48; //Method
>> scala/runtime/BoxesRunTime.unboxToInt:(Ljava/lang/Object;)I
>>   50:  putfield        #50; //Field x2:I
>>
>> In my local tree the given class now compiles to:
>>
>>  private final int x2;
>>  private final java.lang.String x1;
>>
>> And here is the complete constructor:
>>
>>   0:   aload_0
>>   1:   invokespecial   #14; //Method java/lang/Object."<init>":()V
>>   4:   aload_0
>>   5:   ldc     #16; //String str
>>   7:   putfield        #20; //Field x1:Ljava/lang/String;
>>   10:  aload_0
>>   11:  iconst_5
>>   12:  putfield        #22; //Field x2:I
>>   15:  return
>>
>> This only happens if the right hand side of the assignment is a tuple
>> literal.  Nothing changes if it is a block or method call which evalutes
>> to a tuple, and I'm not sure much can be done in that case with this
>> general strategy, because the tuple will have to be stored somewhere for
>> at least long enough to extract the values.
>>
>> --
>> Paul Phillips      | Every election is a sort of advance auction sale
>> Stickler           | of stolen goods.
>> Empiricist         |     -- H. L. Mencken
>> pp: i haul pills   |----------* http://www.improving.org/paulp/
>> *----------
>
>
Reply | Threaded
Open this post in threaded view
|

Re: tuple assignment optimization

Colin Bullock
In addition to the constructor use case already mentioned, I find myself using tuple literals for multiple intializations such as:

val (h, t) = (list.head, list.tail)

Though this could certainly be broken into multiple lines, it would be nice to not pay the price of an extra tuple instance when written this way.

- Colin
Reply | Threaded
Open this post in threaded view
|

Re: tuple assignment optimization

Paul Phillips-3
On Thu, May 14, 2009 at 02:11:19PM -0500, Colin Bullock wrote:
> In addition to the constructor use case already mentioned, I find myself
> using tuple literals for multiple intializations such as:
>
> val (h, t) = (list.head, list.tail)

I'm glad someone used an example which would actually be affected
here... because unfortunately none of the previous ones would.

Will be optimized:

  val (x, y) = (anything, anything)

Will not be optimized:

  val (x, y) = list splitAt index
  val (x, y) = { foo ; bar; (p, q) }
  val (x, y) = z match { case (p1,p2) => (p1,p2) }

--
Paul Phillips      | A national political campaign is better than the
Future Perfect     | best circus ever heard of, with a mass baptism and
Empiricist         | a couple of hangings thrown in.
pal, i pill push   |     -- H. L. Mencken
Reply | Threaded
Open this post in threaded view
|

Re: tuple assignment optimization

Martin Odersky
The problem is less the optimization than the spec. Do we want to
introduce special wording for this?

For that reason, I am OK with the optimization, but less OK with the
static type error.

Cheers

 -- Martin
Reply | Threaded
Open this post in threaded view
|

Re: tuple assignment optimization

Paul Phillips-3
On Thu, May 14, 2009 at 10:00:03PM +0200, martin odersky wrote:
> The problem is less the optimization than the spec. Do we want to
> introduce special wording for this?

Indeed, I thought about this but had no good answer.  The spec issue
would be easier with a more comprehensive solution, but then the
implementation becomes the hard part.

> For that reason, I am OK with the optimization, but less OK with the
> static type error.

Mmmm, that would be kind of sad to give up.  That's in fact the reason I
undertook it.

If I'm just tackling the storage, I suspect this is the wrong approach;
the synthetic tuple should just be marked local and not turned into a
field.  I'm trying that approach now but it's harder than I thought, not
inherently but because of the assumptions in place when lots of other
code was written.  On the plus side it'd solve the storage problem for
every case, not just the simple one.

--
Paul Phillips      | Every normal man must be tempted at times
Imperfectionist    | to spit on his hands, hoist the black flag,
Empiricist         | and begin to slit throats.
ha! spill, pupil   |     -- H. L. Mencken
Reply | Threaded
Open this post in threaded view
|

Re: tuple assignment optimization

Martin Odersky
On Thu, May 14, 2009 at 10:35 PM, Paul Phillips <[hidden email]> wrote:

> On Thu, May 14, 2009 at 10:00:03PM +0200, martin odersky wrote:
>> The problem is less the optimization than the spec. Do we want to
>> introduce special wording for this?
>
> Indeed, I thought about this but had no good answer.  The spec issue
> would be easier with a more comprehensive solution, but then the
> implementation becomes the hard part.
>
>> For that reason, I am OK with the optimization, but less OK with the
>> static type error.
>
> Mmmm, that would be kind of sad to give up.  That's in fact the reason I
> undertook it.
>
> If I'm just tackling the storage, I suspect this is the wrong approach;
> the synthetic tuple should just be marked local and not turned into a
> field.  I'm trying that approach now but it's harder than I thought, not
> inherently but because of the assumptions in place when lots of other
> code was written.  On the plus side it'd solve the storage problem for
> every case, not just the simple one.

Yes, I think that's the right approach. I'd do it rather late in the
compilation pipeline, in phase constructors.

Cheers

 -- Martin