Académique Documents
Professionnel Documents
Culture Documents
Javarevisited
Blog about Java programming language, FIX Protocol, Tibco Rendezvous and related Java technology stack.
Best of Javarevisited
How SubString method works in Java - Memory Leak Fixed in JDK 1.7
Substring method from String class is one of most used method in Java, and it's also part of an interesting String interview
question e.g. How substring works in Java or sometime asked as how does substring creates memory leak in Java. In
order to answer these questions, you knowledge of implementation details is required. Recently one of my friend was
drilled on substring method in Java during a Java interview, he was using substring() method from long time, and of
course all of us has used this, but what surprises him was interviewer's obsession on Java substring, and deep dive till the
implementation level. Though String is a special class in Java, and subject of many interview questions e.g. Why char array
is better than String for storing password. In this case it was, substring method, which took center stage. Most of us rather
just use substring(..), and than forgot. Not every Java programmer go into code, and see how exactly it's working. To
Posts
Comments
Follow Us
Follow @javinpaul
Question starts with normal chit chat, and Interviewer ask, "Have you used substring method in Java", and my friend
proudly said Yes, lot many times, which brings a smile on interviewer's face. He says well, thats good. Next question was
Can you explain what does substring do? My friend got an opportunity to show off his talent, and how much he knows
about Java API; He said substring method is used to get parts of String in Java. Its defined in java.lang.String class,
and it's an overloaded method. One version of substring method takes just beginIndex, and returns part of String started
from beginIndex till end, while other takes two parameters, beginIndex and endIndex, and returns part of String
starting from beginIndex to endIndex-1. He also stressed that every time you call substring() method in Java, it will
return a new String because String is immutable in Java.
Next question was, what will happen if beginIndex is
equal to length in substring(int beginIndex), no it won't
throw IndexOutOfBoundException instead it will return
Followers
Subscribe by email:
By Javin Paul
Loading
Blog Archive
2014
(
76
)
2013
(
136
)
2012
(
217
)
2011
(
145
)
December
(
28
)
November
(
14
)
October
(
14
)
In case If you have still not figured it out, If the original string is very long, and has array of size 1GB, no matter how small a
substring is, it will hold 1GB array. This will also stop original string to be garbage collected, in case if doesn't have any live
reference. This is clear case of memory leak in Java, where memory is retained even if it's not required. That's how
substring method creates memory leak.
Obviously next question from interviewer would be, how do you deal with this problem? Though you can not go, and
change Java substring method, you can still make some work around, in case you are creating substring of significant
longer String. Simple solution is to trim the string, and keep size of character array according to length of substring. Luckily
java.lang.String has constructor to do this, as shown in below example.
How SubString method works in Java - Memory Leak Fixed in JDK 1.7
String apple = new String(
listOfStockSymbolsOnNYSE.substring(appleStartIndex, appleEndIndex)
);
If you look code on java.lang.String class, you will see that this constructor trim the array, if its bigger than String
itself.
September
(
22
)
August
(
11
)
July
(
7
)
June
(
9
)
May
(
6
)
Another way to solve this problem is to call intern() method on substring, which will than fetch an existing string from pool
April
(
10
)
or add it if necessary. Since the String in the pool is a real string it only take space as much it requires. Its also worth noting
that sub-strings are not internalized, when you call intern() method on original String. Most developer successfully
March
(
4
)
answers first three questions, which is related to usage of substring, but they get stuck on last two, How substring creates
memory leak or How substring works. It's not completely there fault, because what you know is that every time
substring() returns new String which is not exactly true, since its backed by same character array.
February
(
10
)
January
(
10
)
2010
(
33
)
This was the only interview question, which bothers my friend little otherwise, its standard service level company Java
interview in India. By the way, he got the call a day after ,even though he struggled little bit on How SubString method works
in Java, and that was the reason he shared this interview experience with me.
References
Posted by
Javin Paul
at
6:09 AM
Labels:
core java
,
core java interview question
Location:
United States
21 comments
:
Anonymous
said...
The code is returning a new String object which contains the same backing array. As the array is an object
reference, the two string objects will be pointing to the same backing array. No other instance of array is
created. So irrespective of number of substrings created from that string, memory will be used for only one
character array. The objects will be different but the backing array will be only one.
Anonymous
said...
I understand how substring works in Java but what is point of keeping original character array inside substring
object ? Since string is immutable you can not change a String once created and SubString are String.
Eclipse
jQuery
How SubString method works in Java - Memory Leak Fixed in JDK 1.7
Anonymous
said...
Hi,
I read your article and i cant able to understand completely. Why the empty string is returned, when begin
index and start index is same, i checked Java api i didn't get clear idea. Why they are doing like this. So i
started searching in internet. I found this post.
http://www.coderanch.com/t/564948/java/java/does-substring-method-works
Can you give more explanation on this one?
Apoorv
said...
Regarding the first comment related to "return ((beginIndex == 0) && (endIndex == count)) ? this : new
String(offset + beginIndex, endIndex - beginIndex, value)". This is correct observation, and in the constructor
used while creating a new string in above code of substring, only required array contents are copied and not
the entire array. So the assumption that creating a small substring of a 1GB sized String will occupy 1GB space
is wrong.
Anonymous
said...
@Apoorv: No array contents are copied. Several Strings can point at the same character array with different
lengths and offsets since this array will never change (which is the whole point of the String class being
immutable). What the new String will contain is a pointer to the 1 GB array with a different length and offset.
What I assume the interviewer was aiming towards was what happens when the original String gets garbaged
collected and there is only your tiny substring looking at 6 characters of this (unbelievably) large array.
I would hope that there is some internal mechanism that handles this in the JVM (there must be)! If I was asked
this question I'd focus on the massive design flaw of a system that relies on substring on a character array of
over a billion elements.
papajohn
said...
I was asked exactly this question in a Google telephone interview. Of course, I failed the last part which
assumed that I knew how substring is implemented internally.
When my reviewer mentioned this implementation, I told him that this is definitely implementation-dependant
and that other JVMs should behave differently.
In the end, I passed that interview.
Dave Conrad
said...
The String(String) constructor has changed between Java 6, 7, and (the early access version of) 8. That bit
about stripping out the baggage is no longer in there.
What you need to do, if your small sub-string of a large string is in variable "name", is:
name = new String(name.toCharArray());
Anonymous
said...
What I don't understand is how this 1G array is backed up by calling substring. The substring will call new
String(value, beginIndex, subLen) and inside this constructor the following code exists: this.value =
Arrays.copyOfRange(value, offset, offset+count). So our five characters are copied into a new array and the
original 1G is not backed up.
Yves Gillet
said...
Yea beware the implementation changed since java 7 at least, the substring no longer backs up the original
char[] array, it creates a new copy of it. Please reflect this in you article.
Yves Gillet
said...
what you are referring to the substring problem was in fact a bug:
http://bugs.sun.com/view_bug.do?bug_id=6294060
How SubString method works in Java - Memory Leak Fixed in JDK 1.7
I hope interviewers are at least aware that some of their questions are not valid anymore :)
@Yves Gillet, you are correct mate, It seems, substring method is now free from memory leak. Will update this
post. Thanks
Anonymous
said...
Is it really true? "no it won't throw Index OutOfBoundException instead it will return empty String, same is the
case when beginIndex and endIndex is same in case of second method."
Source - http://docs.oracle.com/javase/6/docs/api/java/lang/String.html#substring(int)
"
substring
public String substring(int beginIndex)
Returns a new string that is a substring of this string. The substring begins with the character at the specified
index and extends to the end of this string.
Examples:
"unhappy".substring(2) returns "happy"
"Harbison".substring(3) returns "bison"
"emptiness".substring(9) returns "" (an empty string)
Parameters:
beginIndex - the beginning index, inclusive.
Returns:
the specified substring.
Throws:
IndexOutOfBoundsException - if beginIndex is negative or larger than the length of this String object."
@Anonymous, What is your questions? isn't you answering your own question, if I understood correctly, in
"emptiness".substring(9) returns "" (an empty string)
Anonymous
said...
Anonymous
said...
@javin, can you please move the update note at the top of the article? I fear that readers who will only scan
this article will be misinformed.
- Michael
George
said...
Indeed substring in Java will not create any more memory leak from Java Java 1.7.0_06, issue has been fixed
by not allowing sharing of char array. They have even made some changes in java.lang.String class to remove
count and offset int variable, which can further save some memory, but this comes with a cost, by
compromising speed. Earlier substring was O(1) operation because of sharing same char array, but now it's
linear. I still think it's a good move to avoid memory leak, given excessive use of String and SubString method in
Java code.
Anonymous
said...
Hey Javin,
I've been reading your articles and although they are very helpful. This article does have much content on the
topic of sub-string and whatever it does have is quite confusing. Which actually goes for a lot of your articles.
How SubString method works in Java - Memory Leak Fixed in JDK 1.7
On reading this i was under the impression that a brand new string gets created when substring method is called
and because of memory leak, apart from the original string holding 1 gb of data, the substring too holds the
same size of 1 gb in memory which is wrong. When substring is called no new string that gets created and there
is only a reference to the old string.
Please explain the concepts more extensively and please don't take this in any bad way.
Thanks
Soumitra Pathak
said...
Hi,
while debugging we can see that even after substring it had all the content in the backing array.
so if we don't want to keep the same we can use
String new2 = new String(new1.toCharArray());
this will copy the actual array to the backing array.
I am not sure about what will happen to String 'new1' with data 1 GB.
Thanks,
S
Anonymous
said...
SubString will always create a new object, but in 6 it was still referring to the 1gb array, even after performing
the sub string. In 7 we create a copy of new char array to hold the substring and leave the huge array for
garbage collection if not needed.
Anonymous
said...
U have mentioned above that " This will also stop original string to be garbage collected, in case if doesn't have
any live reference.". can you please elaborate it more. I didn't understood it.
Anonymous
said...
Hey guys, this is Uma. Think this is not quite correct: -.what will happen if beginIndex is equal to length in
substring(int beginIndex),for the answer provided - empty string will be returned. Question ought to have been
what will happen if beginIndex is equal to length of string from which you want a piece of. That is what the
person who gave this example: "emptiness".substring(9) returns "" (an empty string) was trying to say.. Cheers!
Nice questions and answers.
Post a Comment
How SubString method works in Java - Memory Leak Fixed in JDK 1.7
Newer
Post
Home
Subscribe to:
Post Comments
(
Atom
)
About
Me
Privacy
Policy
Older
Post