Sagewire Logo

JAVA collection problem

6 Message(s) by 4 Author(s) originally posted in java machine


From: toton Date:   Monday, July 24, 2006
Hi everyone,
I need to have alarge array of small class 'es (in c# sense struct).
For premitive type s, as the ArrayList stores value types, hence it
is possible to use apache common primitives library to save memory &
cache overflow.
However, for array of classes, which are small enough, I want to
store them as value, so that they occupy contiguous memory space ,
rether than heving them reference d from list.
This is just to decrease cache overflow error & speedup things. Is
there a possible way to do it?.
In general, is there a possible way to locate the class in local stack ,
rather than heap or free store? Some common base class or weak
reference type? abir


From: Andrew Reilly Date:   Monday, July 24, 2006
wrote in message:
basically, no. and you're probably not going to get as much speed-up
or space savings as you might think. even if not allocated
contiguously, the copying garbage collector is likely to organize them
to be contiguous after the first GC. the extra indirect from the
array of point ers is pretty cheap.



That depends a *lot* on the type of object, IMO. Consider things like
complex numbers , or (dunno, I've less experience with these:) (x,y,z)
coordinate tuples. Compared to cache-prefetch-assisted in-order access
that you get from an embedded array, pointer dereferences are way
expensive.

if you *really* want to do something like this, you can allocate an
int[] and logically partition it in the code yourself, but this is VERY
nasty, break s the OO philosophy, etc.



Sometimes you just have to break it. I'm pretty sure that a good matrix
arithmetic class'd manage complex numbers the way Matlab does: not as
objects of type "complex", but as a complex-vector class that contains a
vector of floats (or doubles), and another possibly-null vector for the
imaginary part. You still get your OO paradigm, but only at scales larger
than vector. Scalars have to be done differently.

at least from the language perspective, all JAVA objects are
heap-allocated and accessed through a direct reference (pointer).



Eiffel (which pre-dates JAVA) has a nice way of handling this. It has
explicit syntax/key words for managing unboxed objects, in arrays, or as
class members. I imagine that it makes the compiler more complicated, but
it does let you keep your paradigm "all the way down" with good efficiency.

(Hmm: there are JAVA-bytecode back-ends for at least two Eiffel compilers
that I know about, but I have not used them. I wonder how they manage to
support this Eiffel language feature *and* be JVM-compatible?)

Cheers,

--
Andrew


From: toton Date:   Tuesday, July 25, 2006
wrote in message:
wrote in message:
> basically, no. and you're probably not going to get as much speed-up
> or space savings as you might think. even if not allocated
> contiguously, the copying garbage collector is likely to organize them
> to be contiguous after the first GC. the extra indirect from the
> array of pointers is pretty cheap.
That depends a *lot* on the type of object, IMO. Consider things like
complex numbers, or (dunno, I've less experience with these:) (x,y,z)
coordinate tuples. Compared to cache-prefetch-assisted in-order access
that you get from an embedded array, pointer dereferences are way
expensive.


The problem is, the objects are small, but plenty in number. u can
think them as 2D points ... and I've a fixed size array for them...
Now I access them sequentially, but may not create them sequentally.
Thus if I dont have all the points in nearby place in memory it not
only causes one extra indirection, but also cache overflow. The cost
comes high for computational geomery....
I event dont want the objects to be garbage collected. When I create a
new point, I want to put it it the array at the appropriate place (The
array behaves like a circular buffer here). Thus, no memory management
is at all needed for these large number of objects in practical sense.
(Want to do something like C++ placement new, which is used inside STl
vector class). Thus garbage collector will also get a relief rather
than managing large no of objects to e garbage collected....
Well, when a new Point comes, I really create a new point to put it
into a circular buffer,
rether than reintializing the field of the old point, which was already
there to a new value.
This is done, to keep in mind, I do computation on them in seperate
thread . Once I create a point, it is immutable, fields can not be
changed. Thus thread synchronization problems aren't there (which is
another costly thing to do).
Now the points come, the circular buffer stores them, discards the
old, and process in different thread.
Thus, it is better to have them in neaarby memory place, and not t get
garbage collected.
To me, it looks very much like C++ by value array, or C# struct.
The general method do work, but time goes high, ( I think JAVA do
outperform C++ even when it comes for looped processing... or hot
spots, and it has reason to do so). But this is a particular case
where I feel, stack allocation is better, and garbage collector should
not perform...
I've heard that mustang ( JAVA1.6) performs escape analysis to
allocate small objects in stack rether than heap, to speed up this kind
of feature... Do anyone know how it performs?

abir
> if you *really* want to do something like this, you can allocate an
> int[] and logically partition it in the code yourself, but this is VERY
> nasty, breaks the OO philosophy, etc.
Sometimes you just have to break it. I'm pretty sure that a good matrix
arithmetic class'd manage complex numbers the way Matlab does: not as
objects of type "complex", but as a complex-vector class that contains a
vector of floats (or doubles), and another possibly-null vector for the
imaginary part. You still get your OO paradigm, but only at scales larger
than vector. Scalars have to be done differently.
> at least from the language perspective, all JAVA objects are
> heap-allocated and accessed through a direct reference (pointer).
Eiffel (which pre-dates JAVA) has a nice way of handling this. It has
explicit syntax/key words for managing unboxed objects, in arrays, or as
class members. I imagine that it makes the compiler more complicated, but
it does let you keep your paradigm "all the way down" with good efficiency.
(Hmm: there are JAVA-bytecode back-ends for at least two Eiffel compilers
that I know about, but I have not used them. I wonder how they manage to
support this Eiffel language feature *and* be JVM-compatible?)
Cheers,
--
Andrew





From: peter d Date:   Tuesday, July 25, 2006
As toton put it so eloquently,
I need to have alarge array of small class'es (in c# sense struct).



i think you mean objects.

However, for array of classes, which are small enough, I want to
store them as value, so that they occupy contiguous memory space,
rether than heving them referenced from list.
This is just to decrease cache overflow error & speedup things. Is
there a possible way to do it?.



basically, no. and you're probably not going to get as much speed-up
or space savings as you might think. even if not allocated
contiguously, the copying garbage collector is likely to organize them
to be contiguous after the first GC. the extra indirect from the
array of pointers is pretty cheap.

if you *really* want to do something like this, you can allocate an
int[] and logically partition it in the code yourself, but this is
VERY nasty, breaks the OO philosophy, etc.

In general, is there a possible way to locate the class in local stack,
rather than heap or free store? Some common base class or weak
reference type?



same deal. the best to could do is create a couple variables with the
"fields" and manually inline the code in the method.

at least from the language perspective, all JAVA objects are
heap-allocated and accessed through a direct reference (pointer). I presume a virtual machine'd be allowed to break these rules in
cases it can prove equivalence, but I doubt this is used much if at
all. I believe VM implementers instead focus on making the memory
manager fast, fast, and fast.

--
Peter Dillinger
http://www.peterd.org


From: Chris Uppal Date:   Tuesday, July 25, 2006
wrote in message:

The problem is, the objects are small, but plenty in number. u can
think them as 2D points ... and I've a fixed size array for them...



I think that, if the performance you are seeing is /actually/ inadequate (which
I find quite plausible, but you have not posted anything quantative so I can not
tell), then you will be forced to give up on OO style programming for this.

If you want to represent a circular buffer of 2D points, then you could use a
single double[] array with the x and y co-ordinates at even and odd indexes.
Alternatively (and perhaps more feasibly if the data is more complicated than
just a pair of numbers), you could use a bunch of arrays: one for each "field".

You can give that a more OO flavour by creating objects which simply contain an
index into the array(s), and which implement their "fields" by accessing the
arrays. Needless to say, the more performance-critical parts of the code'd
not use those objects, but other parts of the code (presumably the bulk of it,
even if it is boring) could use them.

Other that that (which basically amounts to writting C in JAVA), I do not think
you've any option but to switch to a different language.

-- chris


From: peter d Date:   Tuesday, July 25, 2006
As Chris Uppal put it so eloquently,
Other that that (which basically amounts to writting C in JAVA), I do not think
you've any option but to switch to a different language.



as in JNI

--
Peter Dillinger
http://www.peterd.org



Next Message: Alternative to JVM


Blogs related to JAVA collection problem

Mapping collections (updated)
SOAP supports this natively. The problem with this mechanism however arises from the fact that collections in java are untyped. Thus, to support collections on Java 1.4, a little bit of extra work is required. ...

Why to Use Generics in JAVA
How many times have you faced a Java.lang.ClassCastException while handling Collections? Even if you are a good programmer then also you must have faced this problem sometimes or the other, and if you are not then of course many a times ...

Garbage collection DO exists in C#
Like Java it is able to remove objects that is not in use anymore without being ... This new garbage collection is a property named Tag in the Control class. ... The problem is that changing the model (Tag property) does not have any ...

Blogger's Block #4: Ruby and Java and Stuff
Java's collections library is decent, but not superb. It's nice having the data ... Anyway, Java's biggest failing, I've decided, is its lack of syntax for literal data objects. ... Don't ever let 'em tell you it doesn't cause problems. ...

Dspace Community-list; adding most popular collections entries hack
This way, you could then do a descending sort on the count to get the items with the most number of accesses — ie, probably your most used collections. The initial problem that I ran into was I wanted to use Java’s HashMap like a PHP ...

How We Solved our Garbage Collection Pausing Problem
Needless to say this was a huge performance problem. Pauses are caused by major garbage collections. Minor garbage collections do not cause pausing. ... java ... -XX:+DisableExplicitGC -XX:+UseConcMarkSweepGC -XX:NewSize=1200m ...


Programming | Sports | Autos

copyright 2006
Valid XHTML 1.0 Transitional