Can Symbol be extended with a unique Int id?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Can Symbol be extended with a unique Int id?

Sebastien Diot
  Hello and Happy New Year!

I'm new to Scala and to this list. Hopefully, this is the right place
for my question. I bought the PDF of "Programming in Scala, 2nd
Edition",  and when I got to the part about Symbol literals, I still did
not know what to do with them after reading the example. The sentence:
"There is not much you can do with a symbol, except find out its name"
supported my impression that they were only of "academic interest", so I
searched the Internet and found inspiration: they could be used to
implement something like Prolog in Scala (and also another form of
"enums"). This got me going, but I see one major shortcoming: to do
useful things with them, you need to be able to attach arbitrary
information to them, to extend them, but since they are "interned", I
don't see how this is possible directly.

If Symbols had a unique Int id, which would be given automatically when
they are interned, along with a method that returns this ID from one
Symbol, and an "object" method that returns a Symbol from it's ID, you
could do all kind of interesting things, like storing them in a set
using a bit-set internally, or having your "extended attributes" stored
in a simple array, indexed by this ID. This would be very fast, O(1),
but without this ID you need to do the mapping yourself, probably using
a HashMap, leading to O(log N) access, where N is the number of Symbols
defined.

Is there a way to get such an ID without changing the Symbol class, and
without using a HashMap?

Also related to Symbols, but unrelated to the first question: If I write
'abc in a method, does the compiler create a new Symbol instance every
time and "resolve" it to the already interned instance, or is it clever
enough to create it only once while loading the Class, and just a
reference to it, which would be a lot more efficient?

--

MfG / Regards,
Sebastien Diot

Grossblittersdorferstrasse 257-259
66119 Saarbruecken
Germany
Tel.:  +49 (681) 8808-0
Fax.:  +49 (681) 8808-300
E-Mail.: [hidden email]

EURO DATA GmbH&  Co. KG
Datenverarbeitungsdienst
HR A 6448 Amtsgericht Saarbrücken
Komplementär: A. Reiß&  Sohn GmbH
HR B 4965 Amtsgericht Saarbrücken
Geschäftsführer: Dipl.-Kfm. Karl-Heinz Siebenpfeiffer


Reply | Threaded
Open this post in threaded view
|

Re: Can Symbol be extended with a unique Int id?

Alex Boisvert-2
Hi Sebastien,

Your argumentation is a little flaky.

You write: "to do useful things with them, you need to be able to attach arbitrary information to them, to extend them" and yet the Scala compiler seems to do lots of useful things with symbols and not need to attach information directly to them.

And then you state that HashMap access is O(log N) while it actually is O(1) already.

Have you read the following StackOverflow questions on symbols?

alex


On Sat, Jan 1, 2011 at 6:16 AM, Sebastien Diot <[hidden email]> wrote:
 Hello and Happy New Year!

I'm new to Scala and to this list. Hopefully, this is the right place for my question. I bought the PDF of "Programming in Scala, 2nd Edition",  and when I got to the part about Symbol literals, I still did not know what to do with them after reading the example. The sentence: "There is not much you can do with a symbol, except find out its name" supported my impression that they were only of "academic interest", so I searched the Internet and found inspiration: they could be used to implement something like Prolog in Scala (and also another form of "enums"). This got me going, but I see one major shortcoming: to do useful things with them, you need to be able to attach arbitrary information to them, to extend them, but since they are "interned", I don't see how this is possible directly.

If Symbols had a unique Int id, which would be given automatically when they are interned, along with a method that returns this ID from one Symbol, and an "object" method that returns a Symbol from it's ID, you could do all kind of interesting things, like storing them in a set using a bit-set internally, or having your "extended attributes" stored in a simple array, indexed by this ID. This would be very fast, O(1), but without this ID you need to do the mapping yourself, probably using a HashMap, leading to O(log N) access, where N is the number of Symbols defined.

Is there a way to get such an ID without changing the Symbol class, and without using a HashMap?

Also related to Symbols, but unrelated to the first question: If I write 'abc in a method, does the compiler create a new Symbol instance every time and "resolve" it to the already interned instance, or is it clever enough to create it only once while loading the Class, and just a reference to it, which would be a lot more efficient?

--

MfG / Regards,
Sebastien Diot

Grossblittersdorferstrasse 257-259
66119 Saarbruecken
Germany
Tel.:  +49 (681) 8808-0
Fax.:  +49 (681) 8808-300
E-Mail.: [hidden email]

EURO DATA GmbH&  Co. KG
Datenverarbeitungsdienst
HR A 6448 Amtsgericht Saarbrücken
Komplementär: A. Reiß&  Sohn GmbH
HR B 4965 Amtsgericht Saarbrücken
Geschäftsführer: Dipl.-Kfm. Karl-Heinz Siebenpfeiffer



Reply | Threaded
Open this post in threaded view
|

Re: Can Symbol be extended with a unique Int id?

Sebastien Diot
On 1.1.2011 16:31, Alex Boisvert wrote:
Hi Sebastien,

Your argumentation is a little flaky.

You write: " to do useful things with them, you need to be able to attach arbitrary information to them, to extend them" and yet the Scala compiler seems to do lots of useful things with symbols and not need to attach information directly to them.

And then you state that HashMap access is O(log N) while it actually is O(1) already.

I haven't finished reading the book yet; I will look at the compiler implementation later... But what I do know is that your counter argumentation is flaky too. I might not know much about Scala yet, but I do remember my algorithm courses. Hash maps are a standard algorithm; they are *always* O(log N). If they were not, they would not be hash maps. Even if all keys have different "natural" hashcodes, they could still fall in the same bucket after hashing, which is why hash maps cannot be O(1), if any language. Also, to be able to "intern" a Symbol based on the hashcode, the String hashcode has to be calculated, and that itself is O(size-of-String), not O(1), but presumably it's cached, making it a one time thing.

I'm not saying my idea would make much of a practical difference in most cases, unless one made a really heavy use of Symbols, but since the cost seems negligible to me (one more Int and Ref per Symbol), I thought it was worth considering.


Have you read the following StackOverflow questions on symbols?

Yes, I did see those; that is what brought me to the idea of Prolog. I wouldn't post without looking up in Google first.


alex


On Sat, Jan 1, 2011 at 6:16 AM, Sebastien Diot <[hidden email]> wrote:
 Hello and Happy New Year!

I'm new to Scala and to this list. Hopefully, this is the right place for my question. I bought the PDF of "Programming in Scala, 2nd Edition",  and when I got to the part about Symbol literals, I still did not know what to do with them after reading the example. The sentence: "There is not much you can do with a symbol, except find out its name" supported my impression that they were only of "academic interest", so I searched the Internet and found inspiration: they could be used to implement something like Prolog in Scala (and also another form of "enums"). This got me going, but I see one major shortcoming: to do useful things with them, you need to be able to attach arbitrary information to them, to extend them, but since they are "interned", I don't see how this is possible directly.

If Symbols had a unique Int id, which would be given automatically when they are interned, along with a method that returns this ID from one Symbol, and an "object" method that returns a Symbol from it's ID, you could do all kind of interesting things, like storing them in a set using a bit-set internally, or having your "extended attributes" stored in a simple array, indexed by this ID. This would be very fast, O(1), but without this ID you need to do the mapping yourself, probably using a HashMap, leading to O(log N) access, where N is the number of Symbols defined.

Is there a way to get such an ID without changing the Symbol class, and without using a HashMap?

Also related to Symbols, but unrelated to the first question: If I write 'abc in a method, does the compiler create a new Symbol instance every time and "resolve" it to the already interned instance, or is it clever enough to create it only once while loading the Class, and just a reference to it, which would be a lot more efficient?

-- 

MfG / Regards,
Sebastien Diot

Grossblittersdorferstrasse 257-259
66119 Saarbruecken
Germany
Tel.:  +49 (681) 8808-0           
Fax.:  +49 (681) 8808-300         
E-Mail.: [hidden email]     

EURO DATA GmbH & Co. KG
Datenverarbeitungsdienst
HR A 6448 Amtsgericht Saarbrücken
Komplementär: A. Reiß & Sohn GmbH
HR B 4965 Amtsgericht Saarbrücken
Geschäftsführer: Dipl.-Kfm. Karl-Heinz Siebenpfeiffer

Reply | Threaded
Open this post in threaded view
|

Re: Can Symbol be extended with a unique Int id?

Jim Balter-2
On Mon, Jan 3, 2011 at 12:06 AM, Sebastien Diot <[hidden email]> wrote:

> On 1.1.2011 16:31, Alex Boisvert wrote:
>
> Hi Sebastien,
> Your argumentation is a little flaky.
> You write: " to do useful things with them, you need to be able to attach
> arbitrary information to them, to extend them" and yet the Scala compiler
> seems to do lots of useful things with symbols and not need to attach
> information directly to them.
> And then you state that HashMap access is O(log N) while it actually is O(1)
> already.
>
> I haven't finished reading the book yet; I will look at the compiler
> implementation later... But what I do know is that your counter
> argumentation is flaky too. I might not know much about Scala yet, but I do
> remember my algorithm courses.

Human memory is notoriously fallible.

> Hash maps are a standard algorithm; they are
> *always* O(log N). If they were not, they would not be hash maps. Even if
> all keys have different "natural" hashcodes, they could still fall in the
> same bucket after hashing, which is why hash maps cannot be O(1), if any
> language.

Ahem. See http://en.wikipedia.org/wiki/Hash_table#Performance_analysis

> Also, to be able to "intern" a Symbol based on the hashcode, the
> String hashcode has to be calculated, and that itself is O(size-of-String),
> not O(1), but presumably it's cached, making it a one time thing.
>
> I'm not saying my idea would make much of a practical difference in most
> cases, unless one made a really heavy use of Symbols, but since the cost
> seems negligible to me (one more Int and Ref per Symbol), I thought it was
> worth considering.
>
>
> Have you read the following StackOverflow questions on symbols?
> http://stackoverflow.com/questions/780287/what-are-some-example-use-cases-for-symbol-literals-in-scala
> http://stackoverflow.com/questions/1324466/practical-examples-of-using-symbols-in-scala
>
> Yes, I did see those; that is what brought me to the idea of Prolog. I
> wouldn't post without looking up in Google first.
>
>
> alex
>
> On Sat, Jan 1, 2011 at 6:16 AM, Sebastien Diot <[hidden email]> wrote:
>>
>>  Hello and Happy New Year!
>>
>> I'm new to Scala and to this list. Hopefully, this is the right place for
>> my question. I bought the PDF of "Programming in Scala, 2nd Edition",  and
>> when I got to the part about Symbol literals, I still did not know what to
>> do with them after reading the example. The sentence: "There is not much you
>> can do with a symbol, except find out its name" supported my impression that
>> they were only of "academic interest", so I searched the Internet and found
>> inspiration: they could be used to implement something like Prolog in Scala
>> (and also another form of "enums"). This got me going, but I see one major
>> shortcoming: to do useful things with them, you need to be able to attach
>> arbitrary information to them, to extend them, but since they are
>> "interned", I don't see how this is possible directly.
>>
>> If Symbols had a unique Int id, which would be given automatically when
>> they are interned, along with a method that returns this ID from one Symbol,
>> and an "object" method that returns a Symbol from it's ID, you could do all
>> kind of interesting things, like storing them in a set using a bit-set
>> internally, or having your "extended attributes" stored in a simple array,
>> indexed by this ID. This would be very fast, O(1), but without this ID you
>> need to do the mapping yourself, probably using a HashMap, leading to O(log
>> N) access, where N is the number of Symbols defined.
>>
>> Is there a way to get such an ID without changing the Symbol class, and
>> without using a HashMap?
>>
>> Also related to Symbols, but unrelated to the first question: If I write
>> 'abc in a method, does the compiler create a new Symbol instance every time
>> and "resolve" it to the already interned instance, or is it clever enough to
>> create it only once while loading the Class, and just a reference to it,
>> which would be a lot more efficient?
>
> --
>
> MfG / Regards,
> Sebastien Diot
>
> Grossblittersdorferstrasse 257-259
> 66119 Saarbruecken
> Germany
> Tel.:  +49 (681) 8808-0
> Fax.:  +49 (681) 8808-300
> E-Mail.: [hidden email]
>
> EURO DATA GmbH & Co. KG
> Datenverarbeitungsdienst
> HR A 6448 Amtsgericht Saarbrücken
> Komplementär: A. Reiß & Sohn GmbH
> HR B 4965 Amtsgericht Saarbrücken
> Geschäftsführer: Dipl.-Kfm. Karl-Heinz Siebenpfeiffer
>
>
Reply | Threaded
Open this post in threaded view
|

Re: Can Symbol be extended with a unique Int id?

Sebastien Diot
  On 3.1.2011 10:55, Jim Balter wrote:

> On Mon, Jan 3, 2011 at 12:06 AM, Sebastien Diot<[hidden email]>  wrote:
>> On 1.1.2011 16:31, Alex Boisvert wrote:
>>
>> Hi Sebastien,
>> Your argumentation is a little flaky.
>> You write: " to do useful things with them, you need to be able to attach
>> arbitrary information to them, to extend them" and yet the Scala compiler
>> seems to do lots of useful things with symbols and not need to attach
>> information directly to them.
>> And then you state that HashMap access is O(log N) while it actually is O(1)
>> already.
>>
>> I haven't finished reading the book yet; I will look at the compiler
>> implementation later... But what I do know is that your counter
>> argumentation is flaky too. I might not know much about Scala yet, but I do
>> remember my algorithm courses.
> Human memory is notoriously fallible.
>
>> Hash maps are a standard algorithm; they are
>> *always* O(log N). If they were not, they would not be hash maps. Even if
>> all keys have different "natural" hashcodes, they could still fall in the
>> same bucket after hashing, which is why hash maps cannot be O(1), if any
>> language.
> Ahem. See http://en.wikipedia.org/wiki/Hash_table#Performance_analysis

Hello.

I see; I was wrong, and I'm sorry about replying without even checking
if I was right. This is indeed embarrassing, and makes my argument
pointless. On the bright side, it means that using Symbols will still be
efficient if I use a lot of them without any API change. Thank you for
correcting me.

--

MfG / Regards,
Sebastien Diot

Grossblittersdorferstrasse 257-259
66119 Saarbruecken
Germany
Tel.:  +49 (681) 8808-0
Fax.:  +49 (681) 8808-300
E-Mail.: [hidden email]

EURO DATA GmbH&  Co. KG
Datenverarbeitungsdienst
HR A 6448 Amtsgericht Saarbrücken
Komplementär: A. Reiß&  Sohn GmbH
HR B 4965 Amtsgericht Saarbrücken
Geschäftsführer: Dipl.-Kfm. Karl-Heinz Siebenpfeiffer