Monday, 19 August 2013

java.lang.OutOfMemoryError: GC overhead limit

I have a program that reads a large list of sequences from a file and runs
a calculation on every pair in that list. It then stores all of these
results in a hash map. About halfway through the run, I get a GC overhead
limit error.
I realize this is because the garbage collector is using up 98% of the
computation time and is unable to recover even 2% of the heap. Here is the
code I have:
ArrayList<String> c = loadSequences("file.txt"); // Loads 60 char DNA sequences
// Maps each pair of sequences to its similarity score
HashMap<DNAPair,Double> LSA = new HashMap<DNAPair,Double>();
for(int i = 0; i < c.size(); i++) {
    for(int j = i+1; j < c.size(); j++) {
        LSA.put(new DNAPair(c.get(i),c.get(j)),
                localSeqAlignmentSimilarity(c.get(i),c.get(j)));
    }
}
And here's the code for the actual method:
public static double localSeqAlignmentSimilarity(String s1, String s2) {
    // Pad both sequences so that their characters start at index 1
    s1 = " " + s1;
    s2 = " " + s2;
    int max = 0, h = 0, maxI = 0, maxJ = 0;
    int[][] score = new int[61][61];
    // Traceback codes: 3 = diagonal, 2 = up (gap in s2), 1 = left (gap in s1)
    int[][] pointers = new int[61][61];
    for(int i = 1; i < s1.length(); i++) {
        pointers[i][0] = 2;
    }
    for(int i = 1; i < s2.length(); i++) {
        pointers[0][i] = 1;
    }
    boolean inGap = false;
    for(int i = 1; i < s1.length(); i++) {
        for(int j = 1; j < s2.length(); j++) {
            h = -99;
            if(score[i-1][j-1] + match(s1.charAt(i),s2.charAt(j)) > h) {
                h = score[i-1][j-1] + match(s1.charAt(i),s2.charAt(j));
                pointers[i][j] = 3;
                inGap = false;
            }
            if(!inGap) {
                // Opening a new gap
                if(score[i-1][j] + GAPPENALTY > h) {
                    h = score[i-1][j] + GAPPENALTY;
                    pointers[i][j] = 2;
                    inGap = true;
                }
                if(score[i][j-1] + GAPPENALTY > h) {
                    h = score[i][j-1] + GAPPENALTY;
                    pointers[i][j] = 1;
                    inGap = true;
                }
            } else {
                // Extending an existing gap
                if(score[i-1][j] + GAPEXTENSION > h) {
                    h = score[i-1][j] + GAPEXTENSION;
                    pointers[i][j] = 2;
                    inGap = true;
                }
                if(score[i][j-1] + GAPEXTENSION > h) {
                    h = score[i][j-1] + GAPEXTENSION;
                    pointers[i][j] = 1;
                    inGap = true;
                }
            }
            if(0 > h)
                h = 0;
            score[i][j] = h;
            // Remember the best-scoring cell as the traceback start
            if(h >= max) {
                max = h;
                maxI = i;
                maxJ = j;
            }
        }
    }
    double matches = 0;
    String o1 = "", o2 = "";
    // Follow the pointers back to the origin; the alignment is built in reverse
    while(!(maxI == 0 && maxJ == 0)) {
        if(pointers[maxI][maxJ] == 3) {
            o1 += s1.charAt(maxI);
            o2 += s2.charAt(maxJ);
            maxI--;
            maxJ--;
        } else if(pointers[maxI][maxJ] == 2) {
            o1 += s1.charAt(maxI);
            o2 += "_";
            maxI--;
        } else if(pointers[maxI][maxJ] == 1) {
            o1 += "_";
            o2 += s2.charAt(maxJ);
            maxJ--;
        }
    }
    StringBuilder a = new StringBuilder(o1);
    StringBuilder b = new StringBuilder(o2);
    o1 = a.reverse().toString();
    o2 = b.reverse().toString();
    a.setLength(0);
    b.setLength(0);
    // Similarity = fraction of aligned positions whose characters are identical
    for(int i = 0; i < Math.min(o1.length(), o2.length()); i++) {
        if(o1.charAt(i) == o2.charAt(i)) matches++;
    }
    return matches/Math.min(o1.length(), o2.length());
}
I thought this was caused by all the variables I declare inside the
method (the two int arrays, the StringBuilders, etc.) creating more and
more objects every time the method runs, so I changed them all to static
fields and cleared them on every call (e.g. Arrays.fill(score,0);) instead
of creating new objects.
However, this didn't help at all and I still got the same error.
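A minimal sketch of that kind of buffer reuse, for reference. The class name AlignmentBuffers and the method reset are hypothetical and are not taken from the program above; only the 61 x 61 table size comes from it. Note that a two-dimensional int array has to be cleared row by row: Arrays.fill(score, 0) on an int[][] compiles (the array is treated as an Object[]) but throws an ArrayStoreException at run time.

```java
import java.util.Arrays;

// Hypothetical holder for the reusable alignment tables (names are illustrative).
final class AlignmentBuffers {
    // 61 x 61: 60-character sequences plus the leading pad character.
    static final int[][] SCORE = new int[61][61];
    static final int[][] POINTERS = new int[61][61];

    // Zero every cell of both tables, one row at a time.
    static void reset() {
        for (int[] row : SCORE) {
            Arrays.fill(row, 0);
        }
        for (int[] row : POINTERS) {
            Arrays.fill(row, 0);
        }
    }

    public static void main(String[] args) {
        SCORE[3][4] = 7;
        POINTERS[1][0] = 2;
        reset();
        System.out.println(SCORE[3][4] + " " + POINTERS[1][0]); // prints "0 0"
    }
}
```

Reusing the tables this way only avoids short-lived allocations inside the method. It does not change how many results are kept alive in the collection that stores them.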
Could it be that the hash map that stores all of the calculations is
getting too big for Java to hold? I'm not getting a plain out-of-heap-space
error, so it seems kind of strange.
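For scale, the nested loop stores one entry for every unordered pair (i, j) with i < j, so the number of stored results grows quadratically with the length of the list: n * (n - 1) / 2 entries for n sequences. The sketch below just evaluates that formula. The list sizes in it are hypothetical, since the actual size of file.txt is not stated, and the class name PairCount is made up for illustration.

```java
// Hypothetical helper that counts how many results the pair loop stores.
final class PairCount {
    // Number of unordered pairs (i, j) with i < j among n items.
    static long pairs(long n) {
        return n * (n - 1) / 2;
    }

    public static void main(String[] args) {
        // Assumed list sizes, chosen only to show the growth rate.
        for (long n : new long[] {1_000, 10_000, 100_000}) {
            System.out.println(n + " sequences -> " + pairs(n) + " stored results");
        }
    }
}
```

Each stored result keeps a DNAPair object, a boxed Double, and the collection's own bookkeeping alive for the whole run, so none of it can be reclaimed by the garbage collector while the loop is still going.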
I also changed the command line argument to give more space to the JVM but
that didn't seem to help.
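One way to check whether a heap setting actually took effect is to print the limit the JVM itself reports. This is a minimal sketch, assuming the argument in question was the standard -Xmx option (the exact argument used is not shown here). The class name HeapCheck and the 4g value are illustrative only.

```java
// Hypothetical check that prints the maximum heap size the JVM will use.
final class HeapCheck {
    // Maximum heap the JVM will attempt to use, in mebibytes.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Run as, for example: java -Xmx4g HeapCheck
        // JVM options must come before the class name, or they are passed
        // to the program as ordinary arguments instead.
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

If the printed value does not change when the option changes, the option is not reaching the JVM.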
Any insight on this problem would be helpful. Thanks!