NavigableMap implementation bugs

View: New views
3 Messages — Rating Filter:   Alert me  
< Prev | 1 - 2 | Next >

Re: HashSet performance

by Kevin Bourrillion :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

One thing to consider is that if

(a) you have high-quality hashCode() functions, and
(b) you have fewer than 2^15 elements,

then you could trivially carve two independent functions out of one hash code.  Likewise, three if you have fewer than 2^10, etc.

Problem is, (a) is almost never the case, is it? :)



On Tue, Jun 10, 2008 at 11:22 AM, Martin Buchholz <martinrb@...> wrote:
On Tue, Jun 10, 2008 at 10:28 AM, Marcelo Fukushima <takeshi10@...> wrote:
> you can apply some function over the result of the hashCode, if the
> hashCode's of colliding entries are not identical

Good point.

The two hash functions won't be completely independent,
but this may be good enough in practice.  We know the
low order bits are colliding, so perhaps just shift them
away to the right, and use the remaining ones?

Martin
_______________________________________________
Concurrency-interest mailing list
Concurrency-interest@...
http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest



--
Kevin Bourrillion @ Google
internal: go/javalibraries
google-collections.googlecode.com
google-guice.googlecode.com

_______________________________________________
Concurrency-interest mailing list
Concurrency-interest@...
http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest

Re: HashSet performance

by Kevin Bourrillion :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Tue, Jun 10, 2008 at 2:38 PM, Martin Buchholz <martinrb@...> wrote:

On Tue, Jun 10, 2008 at 12:48 PM, Mark Thornton <mthornton@...> wrote:
> Much as I like optimisation, reusing the Map.Entry object was always a step
> too far for me.

I agree.  I believe only IdentityHashMap is evil in this way today.

Yep, I verified this fact a few weeks ago.  And I agree that it's evil; it causes some extremely bizarre situations.



--
Kevin Bourrillion @ Google
internal: go/javalibraries
google-collections.googlecode.com
google-guice.googlecode.com

_______________________________________________
Concurrency-interest mailing list
Concurrency-interest@...
http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest

Re: HashSet performance

by kohlerm :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi Osvaldo,


On Tue, Jun 10, 2008 at 9:31 PM, Osvaldo Pinali Doederlein <osvaldo@...> wrote:

[snip]

Now, while we are playing with HashSet optimization, another idea is: we could have a method like get(), that looks for an object and if it's found, returns it. The Set interface only offers contains(), which returns a boolean - definitely a cleaner interface for the most common usage of Set, but then you can't use a Set for the VERY important use-case of canonicalization. Canonicalization would only require a Set, but when the canonic object is found, if want to return it, so the caller can use it in lieu of the original object. We are forced to use an explicit Map collection just because it offers the get() lookup method. A Set.get() method would be easily defined: public T get (T obj), checks if the set contains an object that's equal to obj, and returns that object (or null if not found). Now, we obviously cannot add new methods to the Set interface, but we could do that in concrete implementations like HashSet. Implementation of such method would be trivial (the existing contains() would me morphed into that new method - requiring only to change a return statement and the signature - and the new contains() would return "get(obj) != null").

This sound like very good idea to me. 

P.S.: Sun doesn't need a HashSet that allows efficient canonicalization because they cheat - String.intern() is a native method, probably implemented with a more efficient collection in the native layer. :)

It is implemented in the native layer, but it is not necessarily faster than HashMap or ConcurrentHashMap.
Actually it used to be much slower in JDK 1.4.2 because the JVM used a relatively small hashtable and if you interned enough Strings it would degenerate because it uses linked lists to resolve collisions.
"We" fixed this in the SAP JVM and I think newer Java version got better as well.

Still, ConcurrentHashMap can be faster in case you have parallel access by many threads.

The main advantage of String.intern() is that it acts like a WeakHashmap, but without the high memory overhead.

In the SAP JVM we therefore implemented an optimized String.intern(), which is fast enough for us for now, but it would be really nice to have an optimized ConcurrentWeakSet with support for the API you described above. 

Regards,
Markus


A+
Osvaldo
 
-----------------------------------------------------------------------
Osvaldo Pinali Doederlein                   Visionnaire Informática S/A
osvaldo@...                http://www.visionnaire.com.br
Arquiteto de Tecnologia                          +55 (41) 337-1000 #223

_______________________________________________
Concurrency-interest mailing list
Concurrency-interest@...
http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest



_______________________________________________
Concurrency-interest mailing list
Concurrency-interest@...
http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest
< Prev | 1 - 2 | Next >
LightInTheBox - Buy quality products at wholesale price!