yeom [Fri, 8 Apr 2011 03:45:29 +0000 (03:45 +0000)]
fixes -nostalltr flag: if it is valid to prune heap examiners, turn off both the task and stallsite examiners
jzhou [Fri, 8 Apr 2011 03:42:44 +0000 (03:42 +0000)]
Enable 1,2,4,8,16,32,50 cores' execution
jjenista [Thu, 7 Apr 2011 02:33:53 +0000 (02:33 +0000)]
TODO: move record pool to a thread-local thing and never deallocate it. Issue is that a child can retire, then its parent, then deallocate the child's entire mem pool... then the child's traverser finally gets around to the record
jjenista [Thu, 7 Apr 2011 02:32:00 +0000 (02:32 +0000)]
bug fix, at task exit we are looking for sources that cause a virtual read of something the task wrote; don't just look at parent and siblings, but also the parent's parent and the parent's siblings, etc.
yeom [Thu, 7 Apr 2011 02:24:50 +0000 (02:24 +0000)]
change to the labyrinth run script: interleave memory across all cpus, like the other run scripts
bdemsky [Thu, 7 Apr 2011 01:21:33 +0000 (01:21 +0000)]
hack seems to help a little...power isn't going to have good performance...
bdemsky [Wed, 6 Apr 2011 23:33:51 +0000 (23:33 +0000)]
D2 bug
bdemsky [Wed, 6 Apr 2011 21:55:59 +0000 (21:55 +0000)]
fix SOR bug
bdemsky [Wed, 6 Apr 2011 21:15:22 +0000 (21:15 +0000)]
bug and remove debug code
bdemsky [Wed, 6 Apr 2011 20:20:12 +0000 (20:20 +0000)]
get back some of the speed we are losing from bug fixes...
bdemsky [Wed, 6 Apr 2011 20:01:46 +0000 (20:01 +0000)]
bug fix...wasn't adding out of context nodes that get summarized to set of nodes that exist in context...
jjenista [Wed, 6 Apr 2011 19:41:47 +0000 (19:41 +0000)]
change benchmark to run once, not up to seven times
jjenista [Wed, 6 Apr 2011 19:39:10 +0000 (19:39 +0000)]
alter benchmark to run once, not up to seven times
jjenista [Wed, 6 Apr 2011 19:15:39 +0000 (19:15 +0000)]
get debug statements out of single version
jzhou [Wed, 6 Apr 2011 16:50:39 +0000 (16:50 +0000)]
Changes
bdemsky [Wed, 6 Apr 2011 08:11:57 +0000 (08:11 +0000)]
bug fix...things are slower to compile...:(
stephey [Wed, 6 Apr 2011 07:33:22 +0000 (07:33 +0000)]
bug fix for refCount in SMFEStates. I'm not sure if this fix covers all cases (it seems like it should to me)...
bdemsky [Wed, 6 Apr 2011 03:31:46 +0000 (03:31 +0000)]
bug fix
bdemsky [Wed, 6 Apr 2011 03:24:48 +0000 (03:24 +0000)]
bug fix and other change
bdemsky [Wed, 6 Apr 2011 03:22:55 +0000 (03:22 +0000)]
checked in new version of bm
jjenista [Wed, 6 Apr 2011 01:18:17 +0000 (01:18 +0000)]
bug fix, have to dereference this pointer
yeom [Wed, 6 Apr 2011 00:49:05 +0000 (00:49 +0000)]
changes.
yeom [Wed, 6 Apr 2011 00:39:23 +0000 (00:39 +0000)]
changes: handle the case that there is only one work item after enqueue, then it reports nothing and makes the worker steal from other queues.
bdemsky [Tue, 5 Apr 2011 23:03:42 +0000 (23:03 +0000)]
bug fix...code that was never exercised...
jjenista [Tue, 5 Apr 2011 23:02:07 +0000 (23:02 +0000)]
static initializers cause a problem for DFJ, easy enough to work around them
jjenista [Tue, 5 Apr 2011 21:50:53 +0000 (21:50 +0000)]
version of delaunay that has a spin bug
jjenista [Tue, 5 Apr 2011 21:50:25 +0000 (21:50 +0000)]
get this debug print out of here
bdemsky [Tue, 5 Apr 2011 20:45:02 +0000 (20:45 +0000)]
gc bug fix
bdemsky [Tue, 5 Apr 2011 20:44:49 +0000 (20:44 +0000)]
bug fix...
yeom [Tue, 5 Apr 2011 20:26:12 +0000 (20:26 +0000)]
updates.
jjenista [Tue, 5 Apr 2011 17:40:14 +0000 (17:40 +0000)]
at task exit, a task should acquire any out-set variables that aren't already in the task record. These could be from static or dynamic sources, so some extra analysis info should be saved to generate the correct copy statements. Make sure static and dynamic tracking variables are generated too
jjenista [Mon, 4 Apr 2011 20:43:53 +0000 (20:43 +0000)]
speculative version of delaunay works single-threaded just fine
yeom [Mon, 4 Apr 2011 18:34:18 +0000 (18:34 +0000)]
changes on run scripts
jjenista [Mon, 4 Apr 2011 18:29:21 +0000 (18:29 +0000)]
dfj version of delaunay refinement executes the original algorithm with workers=1, but with workers set to 2 another malformed triangle appears...
bdemsky [Mon, 4 Apr 2011 01:41:40 +0000 (01:41 +0000)]
#ifdef this out if not threads
yeom [Mon, 4 Apr 2011 01:14:06 +0000 (01:14 +0000)]
bug fix on the oooJava queue: when trying to add a vector item, get rid of the finished item in front of it.
jjenista [Mon, 4 Apr 2011 00:19:05 +0000 (00:19 +0000)]
keep a copy of the original delaunay refinement algorithm for comparison, and dfj version is crashing on nodes (triangles) with zero neighbors, but original configured graphs are the same size...
bdemsky [Mon, 4 Apr 2011 00:06:45 +0000 (00:06 +0000)]
lock bug for gc...jin, make sure tilera gc doesn't have same bug...need to update thread lock pointer to locks at end of gc
bdemsky [Sun, 3 Apr 2011 23:43:43 +0000 (23:43 +0000)]
bug fixes
bdemsky [Sun, 3 Apr 2011 23:21:01 +0000 (23:21 +0000)]
bug...
bdemsky [Sun, 3 Apr 2011 21:52:03 +0000 (21:52 +0000)]
ugly hacks to get around compiler bugs...
bdemsky [Sun, 3 Apr 2011 21:41:10 +0000 (21:41 +0000)]
bugs in buildflat...break optimization phase
yeom [Sun, 3 Apr 2011 21:13:35 +0000 (21:13 +0000)]
changes: build conflict graphs that have line number information of stall sites
bdemsky [Sun, 3 Apr 2011 20:29:39 +0000 (20:29 +0000)]
similar bugs to rcr
jjenista [Sun, 3 Apr 2011 18:52:23 +0000 (18:52 +0000)]
reworking parallel implementation, both single and DFJ runs build a zero-node configuration right now, whoops
jjenista [Sun, 3 Apr 2011 18:51:44 +0000 (18:51 +0000)]
Add a warning when a method call does not resolve to a defined method
bdemsky [Sun, 3 Apr 2011 08:56:04 +0000 (08:56 +0000)]
changes
bdemsky [Sun, 3 Apr 2011 06:23:47 +0000 (06:23 +0000)]
changes to pass specJBB validation
bdemsky [Sun, 3 Apr 2011 05:46:53 +0000 (05:46 +0000)]
specjbb build on intel now...
bdemsky [Sun, 3 Apr 2011 05:38:59 +0000 (05:38 +0000)]
add wait/notify
bdemsky [Sun, 3 Apr 2011 05:26:46 +0000 (05:26 +0000)]
change
bdemsky [Sun, 3 Apr 2011 04:23:36 +0000 (04:23 +0000)]
change
bdemsky [Sun, 3 Apr 2011 04:22:56 +0000 (04:22 +0000)]
add ieeeremainder
bdemsky [Sun, 3 Apr 2011 04:21:45 +0000 (04:21 +0000)]
make this work with MGC
bdemsky [Sun, 3 Apr 2011 04:12:23 +0000 (04:12 +0000)]
fix array
bdemsky [Sun, 3 Apr 2011 04:09:24 +0000 (04:09 +0000)]
small bug
bdemsky [Sun, 3 Apr 2011 03:56:30 +0000 (03:56 +0000)]
add new methods
bdemsky [Sun, 3 Apr 2011 03:56:14 +0000 (03:56 +0000)]
changes for correctness
bdemsky [Sun, 3 Apr 2011 03:17:22 +0000 (03:17 +0000)]
trying to build specjbb on intel
bdemsky [Sun, 3 Apr 2011 03:17:05 +0000 (03:17 +0000)]
changes
bdemsky [Sat, 2 Apr 2011 23:26:43 +0000 (23:26 +0000)]
fixed benchmark
bdemsky [Sat, 2 Apr 2011 22:53:33 +0000 (22:53 +0000)]
really, really nasty bug...see page 8-9 of vol 3A of intel processor manual for x86 memory reordering...
bdemsky [Sat, 2 Apr 2011 22:49:40 +0000 (22:49 +0000)]
new information...x86 allows reads to be reordered with earlier writes to different locations....
add mbarrier for situations where we can't allow this to happen...
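Note on the reordering above: a minimal sketch in C11 of the store-then-load pattern that a full barrier protects. This is not the runtime's actual mbarrier code; the flag names and the helper announce_then_check are hypothetical, and atomic_thread_fence stands in for whatever barrier primitive the runtime uses.

```c
#include <stdatomic.h>

/* Hypothetical handshake flags, zero-initialized; names are illustrative only. */
static atomic_int flag_a;
static atomic_int flag_b;

/* Publish our own flag, then read the peer's flag.
 * Without the fence, x86 may satisfy the load before the earlier store
 * (to a different location) is visible to other cores, so two threads
 * running this with swapped arguments could both read 0.
 * The sequentially consistent fence rules that outcome out. */
static int announce_then_check(atomic_int *mine, atomic_int *theirs) {
    atomic_store_explicit(mine, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst); /* full barrier between store and load */
    return atomic_load_explicit(theirs, memory_order_relaxed);
}
```

If one thread calls announce_then_check(&flag_a, &flag_b) and another calls announce_then_check(&flag_b, &flag_a), at least one of them is guaranteed to see the other's flag set.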
bdemsky [Sat, 2 Apr 2011 21:03:59 +0000 (21:03 +0000)]
missing barrier
bdemsky [Sat, 2 Apr 2011 20:22:58 +0000 (20:22 +0000)]
bug fix
jjenista [Sat, 2 Apr 2011 15:12:31 +0000 (15:12 +0000)]
stuff for running set up
jjenista [Sat, 2 Apr 2011 15:06:07 +0000 (15:06 +0000)]
modified algorithm for dfj style parallelism
bdemsky [Sat, 2 Apr 2011 04:05:53 +0000 (04:05 +0000)]
static variables now actually work...we don't want static variables here i think though...
bdemsky [Sat, 2 Apr 2011 04:02:42 +0000 (04:02 +0000)]
add RCR to the global gc initializer...bad if we have globals though
jzhou [Sat, 2 Apr 2011 03:39:06 +0000 (03:39 +0000)]
Bug fixes and add some code for easy debugging
bdemsky [Sat, 2 Apr 2011 03:38:01 +0000 (03:38 +0000)]
bug fix for yeom...can you try to see if power works now?
bdemsky [Fri, 1 Apr 2011 21:52:15 +0000 (21:52 +0000)]
build pruned graphs that encode conflict information...
double periphery nodes may contain conflicts...conflict effects are put in brackets...
edges that have conflicts are dashed
stephey [Fri, 1 Apr 2011 19:55:05 +0000 (19:55 +0000)]
These are the files that were giving me trouble last night. The single version runs fine, the rcr version errors out (either it's a RCR build problem, an RCR runtime error, or it screws with the data of the actual program and causes it to error out at Cavity.build...). This may indicate that there's something wrong with our runtime...
bdemsky [Fri, 1 Apr 2011 02:10:17 +0000 (02:10 +0000)]
make hashCode a native method for java lang Object...
bdemsky [Fri, 1 Apr 2011 02:03:53 +0000 (02:03 +0000)]
move hashCode method to native method
bdemsky [Fri, 1 Apr 2011 00:20:42 +0000 (00:20 +0000)]
bug fix for stephen
yeom [Thu, 31 Mar 2011 22:53:12 +0000 (22:53 +0000)]
only print out line numbers in debug mode
yeom [Thu, 31 Mar 2011 21:45:51 +0000 (21:45 +0000)]
bring last changes before executing benchmarks
yeom [Thu, 31 Mar 2011 19:08:32 +0000 (19:08 +0000)]
changes: reorganizes debug messages
yeom [Thu, 31 Mar 2011 18:43:42 +0000 (18:43 +0000)]
add a new compiler flag -nolock: turning off synchronization lock
yeom [Thu, 31 Mar 2011 17:46:41 +0000 (17:46 +0000)]
changes for better debug messages
yeom [Thu, 31 Mar 2011 17:26:22 +0000 (17:26 +0000)]
changes.
stephey [Thu, 31 Mar 2011 09:52:47 +0000 (09:52 +0000)]
Added in semi-trivial locations (to test it). However, can't confirm results because the old make rcr takes a while (I didn't let it finish, but it was over 5 mins...) and the new one crashes with a NULLPOINTEREXCEPTION
stephey [Thu, 31 Mar 2011 09:04:04 +0000 (09:04 +0000)]
Tried to squeeze out more performance by changing the LinkedLists in the Delaunay port to vectors (which is closer to the original implementation of ArrayLists). Seems not to make an appreciable difference though...
Added Vector.clone() and VectorIterator
jzhou [Thu, 31 Mar 2011 03:46:19 +0000 (03:46 +0000)]
Bug fix: should use unsigned int instead of int for pointers in shared heap. Also as we now use a big share array to hold mapping information, do not need to allocate extra memory for master core's local heap.
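Note on the signed/unsigned fix above: the commit does not say how the bug showed up, so the following is only a sketch of one way a signed int can go wrong when it holds a 32-bit heap address. The function names and the address values are hypothetical, and the signed variant assumes the usual two's-complement conversion done by common compilers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Range check with addresses held as unsigned 32-bit values:
 * ordering matches the real layout of the address space. */
static bool in_heap_unsigned(uint32_t addr, uint32_t base, uint32_t end) {
    return addr >= base && addr < end;
}

/* The same check with addresses held as signed 32-bit values:
 * any address at or above 0x80000000 becomes negative, so it
 * compares as smaller than every address below that boundary. */
static bool in_heap_signed(int32_t addr, int32_t base, int32_t end) {
    return addr >= base && addr < end;
}
```

For a heap spanning 0x70000000 to 0x90000000, the address 0x80000000 is inside the heap according to in_heap_unsigned but is rejected by in_heap_signed.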
yeom [Thu, 31 Mar 2011 03:00:34 +0000 (03:00 +0000)]
changes: only apply the variable analysis on the method containing TASK
yeom [Thu, 31 Mar 2011 02:52:15 +0000 (02:52 +0000)]
change
stephey [Thu, 31 Mar 2011 02:28:21 +0000 (02:28 +0000)]
Added manual invocation of garbage collector via System.gc()
stephey [Thu, 31 Mar 2011 02:27:46 +0000 (02:27 +0000)]
Optimized HashMap/HashSet and added System.gc() for manual garbage collection invocation
stephey [Thu, 31 Mar 2011 02:25:53 +0000 (02:25 +0000)]
Changes/optimizations to benchmark.
yeom [Thu, 31 Mar 2011 01:26:31 +0000 (01:26 +0000)]
change due to changes of math class library
yeom [Wed, 30 Mar 2011 23:12:40 +0000 (23:12 +0000)]
bug fixes on OoOJava, now it works fine with all of the benchmarks (but Kmeans has a lower speedup, 10.6x rather than 13.8x; still working on it...)
-changes on the potential stall site analysis: propagating the status of callee's return node to the caller region and when current node has a status change, making following nodes updated to get a new potential stall site status.
-changes on liveness analysis of OoOJava: new analysis only covers the task region, not whole region of the flat method.
-changes on disjoint analysis: rather than using reachgraph's inAccessibleSet, using the result of brian's new accessible analysis.
stephey [Wed, 30 Mar 2011 06:22:47 +0000 (06:22 +0000)]
The benchmark works as far as I can tell (passes internal tests). We just need to put in sese blocks now if that's the case.
... There are things preventing me from doing a precise check.
1) It appears that given the same input and everything, the while loop will inevitably run a different number of times for each run (this occurs in both the original and ported code) and pick a different order of triangles to process.
2) The differences in how we and Java handle collection/sets gives a different ordering of which triangles we process (which affects which triangles become bad at the end).
Aside from that, there seem to be some performance issues.
1) It seems that the original code benefits from branch prediction, as successive, duplicate runs (inherent in the benchmark) yield faster and faster timings. Our run-time stays the same in successive runs. This is true even though the number of triangles processed stays relatively close from run to run.
2) When processing the large.2 files (about 10k triangles), our first runtime and the original code's runtime are within 100ms of each other (out of roughly 1000ms), which sounds reasonable (note that on successive runs, the original code speeds up by a factor of 1.6). However, when we process the massive.2 files (roughly 100k triangles), our runtimes are double the original code's runtime (but successive runs of the original code yield a speedup of only 1.12. Whatever speedup it got on the large.2 file is amortized). Also, our ported code garbage collects and then crashes (out of memory) on the second run (there are 3 repeated runs).
This is as much as I can do at the moment, going to need some guidance on the performance issues mentioned above and also where to place the sese blocks.
stephey [Wed, 30 Mar 2011 04:11:15 +0000 (04:11 +0000)]
It compiles and runs now... but it doesn't appear to be doing the right thing... It passes internal verification but appears to be taking fewer iterations than the original program...
stephey [Tue, 29 Mar 2011 23:37:34 +0000 (23:37 +0000)]
Compiles with a few warnings... Doesn't run yet though.
stephey [Tue, 29 Mar 2011 22:54:40 +0000 (22:54 +0000)]
Added an error case
stephey [Tue, 29 Mar 2011 19:05:43 +0000 (19:05 +0000)]
Shouldn't have been checked in the first place... accident.
jzhou [Tue, 29 Mar 2011 16:44:14 +0000 (16:44 +0000)]
Remove the local mapping tbl and shared mapping tbl in gc, instead, use a big shared array to hold the mapping information of moved objs
stephey [Tue, 29 Mar 2011 08:14:26 +0000 (08:14 +0000)]
Closer to compiling. At the moment, it throws a NullPointerException right after the output Build class:LinkedList. Not sure what's going on there.
bdemsky [Tue, 29 Mar 2011 07:37:15 +0000 (07:37 +0000)]
clean up math.java a little...standardize things to the actual Java class library...plus i didn't like having more ifs than needed