String is a very special class in Java, and the most frequently used one as well. There is more to learn about String than about most other classes, and a good knowledge of its behaviour helps you use it properly. Given the heavy use of String in almost every kind of project, it becomes even more important to know its subtle details. Though I have already shared a lot of String-related articles here on Javarevisited, this is an effort to bring some of those String features together. In this tutorial we will see some important points about Java String that are worth remembering. You can also refer to my earlier post, 10 advanced Java String questions, to know more about String. Though I have tried to cover a lot of things, there are definitely a few that I might have missed; please let me know if you have any question or doubt about java.lang.String functionality and I will try to address it here.
1) Strings are not null terminated in Java
Unlike in C and C++, a String in Java is not terminated with a null character. Instead, Strings are objects in Java, backed by a character array, and the length is tracked by the object itself. You can get a copy of the characters that make up a String by calling the toCharArray() method of the java.lang.String class; changing that copy does not affect the String.
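As a small sketch of the point above (the class and method names here are made up for illustration), the following shows that the array returned by toCharArray() is a copy and that a NUL character is just an ordinary character inside a Java String:

```java
final class NullTerminationDemo {
    // Modifies the array returned by toCharArray() and hands back the original String.
    static String mutateCopy(String original) {
        char[] chars = original.toCharArray(); // a fresh copy on every call
        chars[0] = 'L';                        // changes only the copy
        return original;
    }

    // A String may contain the NUL character; it does not end the String.
    static int lengthWithNul() {
        return "a\0b".length();
    }

    public static void main(String[] args) {
        System.out.println(mutateCopy("Java")); // the String itself is unchanged
        System.out.println(lengthWithNul());    // all three characters are counted
    }
}
```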
2) Strings are immutable and final in Java
Strings are immutable in Java, which means that once a String is created you cannot modify its content. If you appear to modify it by using toLowerCase(), toUpperCase() or any other method, the original is left untouched and any changed content comes back as a new String. Since the String class is final, no one can extend String or override any of its functionality. If you are puzzled about why String is immutable or final in Java, check out the link.
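A minimal sketch of what immutability means in practice, using a hypothetical ImmutabilityDemo class: the call to toUpperCase() produces a second String and leaves the first one as it was.

```java
import java.util.Locale;

final class ImmutabilityDemo {
    // Returns { original, upper-cased result } so both can be inspected.
    static String[] upperCase(String original) {
        String upper = original.toUpperCase(Locale.ROOT); // new String with the changed content
        return new String[] { original, upper };
    }

    public static void main(String[] args) {
        String[] result = upperCase("java");
        System.out.println(result[0]); // still lower case
        System.out.println(result[1]); // the upper-cased copy
    }
}
```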
3) Strings are maintained in String Pool
As I said earlier, String is a special class in Java, and all String literals, e.g. "abc" (anything inside double quotes is a String literal in Java), are maintained in a separate String pool. Up to Java 6 this pool lived in PermGen space; from Java 7 onwards it is part of the main Java heap. Any time you create a String using a literal, the JVM first checks the String pool, and if a String with the same content is already there, it returns that one instead of creating a new object. The JVM doesn't perform this pool check if you create the object using the new operator.
You may face subtle issues if you are not aware of this String behaviour. Here is an example:
String name = "Scala"; //1st String object
String name_1 = "Scala"; //same object referenced by name variable
String name_2 = new String("Scala"); //different String object
//this condition is true
if(name==name_1){
    System.out.println("both name and name_1 are pointing to the same String object");
}
//this condition is false
if(name==name_2){
    System.out.println("both name and name_2 are pointing to the same String object");
}
If you compare name and name_1 using the equality operator "==", it returns true because both point to the same object, while name==name_2 returns false because they point to different String objects. It's worth remembering that the "==" operator compares object references and not the characters of the Strings. By default Java puts all String literals into the String pool, but you can also put any other String into the pool, such as one created with the new operator, by calling the intern() method of the java.lang.String class.
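To illustrate intern(), here is a short sketch (InternDemo is a made-up name) that compares a literal with a String created through new, before and after interning:

```java
final class InternDemo {
    // Returns { literal == copy, literal == copy.intern(), literal.equals(copy) }.
    static boolean[] compare() {
        String literal = "Scala";          // taken from the String pool
        String copy = new String("Scala"); // a separate object with the same content
        return new boolean[] {
            literal == copy,               // different objects
            literal == copy.intern(),      // intern() returns the pooled instance
            literal.equals(copy)           // same characters
        };
    }

    public static void main(String[] args) {
        for (boolean b : compare()) {
            System.out.println(b);
        }
    }
}
```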
4) Use equals() method for comparing String in Java
The String class overrides the equals() method to provide content equality, which is based on characters, case and order. So if you want to compare two String objects to check whether they are the same or not, always use the equals() method instead of the equality operator. In the earlier example, if we use equals() to compare the objects, they are all equal to each other because they all contain the same content. Here is an example of comparing Strings using the equals() method:
String name = "Java"; //1st String object
String name_1 = "Java"; //same object referenced by name variable
String name_2 = new String("Java"); //different String object
//this condition is true
if(name.equals(name_1)){
    System.out.println("name and name_1 are equal String by equals method");
}
//this condition is also true, although name == name_2 would be false
if(name.equals(name_2)){
    System.out.println("name and name_2 are equal String by equals method");
}
You can also check my earlier post, difference between equals() method and == operator, for a more detailed discussion of the consequences of comparing two Strings using the == operator in Java.
5) Use indexOf() and lastIndexOf() or matches(String regex) method to search inside String
The String class in Java provides convenient methods to see whether a character, a substring or a pattern exists in the current String object. You can use indexOf(), which returns the position of the character or String if it exists in the current String object, or -1 if it doesn't. lastIndexOf() is similar, but it searches from the end. String.matches(String regex) is even more powerful, as it lets you test the String against a regular expression pattern. Here are examples of the indexOf(), lastIndexOf() and matches() methods from the java.lang.String class:
String str = "Java is best programming language";
if(str.indexOf("Java") != -1){
    System.out.println("String contains Java at index :" + str.indexOf("Java"));
}
if(str.matches("J.*")){
    System.out.println("String Starts with J");
}
str ="Do you like Java ME or Java EE";
if(str.lastIndexOf("Java") != -1){
    System.out.println("String contains Java lastly at: " + str.lastIndexOf("Java"));
}
As expected, indexOf() returns 0 because characters in a String are indexed from zero. lastIndexOf() returns the index of the second "Java", which starts at 23, and matches() returns true because the pattern J.* matches any String that starts with the character J, followed by any character (the dot) any number of times (the asterisk).
Remember that matches() is tricky and sometimes non-intuitive. If you just pass "Java" to matches(), it returns false because the whole String is not equal to "Java", i.e. for plain text it behaves like the equals() method. See here for more examples of the String matches() method.
Apart from indexOf(), lastIndexOf() and matches(String regex), String also has methods like startsWith() and endsWith(), which can be used to check whether a String starts or ends with a certain prefix or suffix.
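The whole-String behaviour of matches() can be seen in the sketch below. SearchDemo and its method names are only illustrative; the partial search uses java.util.regex.Matcher.find(), which is one way, not the only way, to look for a pattern anywhere inside a String.

```java
import java.util.regex.Pattern;

final class SearchDemo {
    // True only if the entire text matches the regular expression.
    static boolean wholeMatch(String text, String regex) {
        return text.matches(regex);
    }

    // True if the regular expression matches anywhere inside the text.
    static boolean partialMatch(String text, String regex) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String str = "Java is best programming language";
        System.out.println(wholeMatch(str, "Java"));     // false: the whole String is not "Java"
        System.out.println(wholeMatch(str, ".*Java.*")); // true: the pattern covers the whole String
        System.out.println(partialMatch(str, "Java"));   // true: found at the start
        System.out.println(str.startsWith("Java"));      // true
        System.out.println(str.endsWith("language"));    // true
    }
}
```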
6) Use substring() to get part of String in Java
Java String provides another useful method called substring(), which can be used to get part of a String. Basically you specify a start index and an end index, and substring() returns the characters from start up to, but not including, end. Indexes start at 0 and go up to String.length()-1. By the way, String.length() returns the number of characters in the String, including white space like tabs and spaces. One point worth remembering is that before Java 7 update 6 the substring was backed by the same character array as the original String. This could be dangerous if the original String was very large and the substring very small, because even a small substring held a reference to the complete array and prevented it from being garbage collected, even when there was no other reference to the original String. From Java 7 update 6 onwards, substring() copies the characters it needs, so this problem no longer applies. Read How Substring works in Java for more details. Here is an example of using substring() in Java:
String str = "Java is best programming language";
//this will return the part of String str from index 0 to 11 (end index 12 is exclusive)
String subString = str.substring(0,12);
System.out.println("Substring: " + subString);
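As a further sketch of how the two indexes behave (SubstringDemo is a hypothetical helper, not part of the JDK), note that the length of the result is always end minus begin, and that the one-argument form runs to the end of the String:

```java
final class SubstringDemo {
    // Thin wrapper around substring(begin, end); end is exclusive.
    static String slice(String text, int begin, int end) {
        return text.substring(begin, end);
    }

    public static void main(String[] args) {
        String str = "Java is best programming language";
        String head = slice(str, 0, 12);      // characters at index 0..11
        String tail = str.substring(13);      // from index 13 to the end
        System.out.println(head);             // Java is best
        System.out.println(head.length());    // 12, i.e. end - begin
        System.out.println(tail);             // programming language
    }
}
```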
7) "+" is overloaded for String concatenation
Java doesn't support operator overloading, but String is special and the + operator can be used to concatenate two Strings. It can even be used to convert an int, char, long or double into a String by simply concatenating it with the empty String "". Internally, + was implemented using StringBuffer prior to Java 5 and StringBuilder from Java 5 onwards. This also brings up the point of using StringBuffer or StringBuilder for manipulating Strings. Since both represent mutable objects, they can be used to reduce the garbage created by temporary Strings. Read more about StringBuffer vs StringBuilder here.
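A brief sketch of both ideas, with made-up class and method names: converting a number through concatenation with "", and building a longer String in a loop with a single StringBuilder instead of repeated + operations.

```java
final class ConcatDemo {
    // Converts an int to a String by concatenating it with the empty String.
    static String viaPlus(int number) {
        return "" + number;
    }

    // Joins 1..count with commas using one mutable StringBuilder.
    static String joinNumbers(int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i <= count; i++) {
            if (i > 1) {
                sb.append(',');
            }
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(viaPlus(42));    // 42
        System.out.println(joinNumbers(3)); // 1,2,3
    }
}
```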
8) Use trim() to remove white spaces from String
String in Java provides the trim() method to remove white space from both ends of a String. If trim() removes any white space it returns a new String, otherwise it returns the same String. Along with trim(), String also provides the replace() and replaceAll() methods for replacing characters in a String. replace() treats its arguments as literal text, while replaceAll() takes a regular expression. Read more about How to replace String in Java here.
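The difference between literal and regular-expression replacement is easy to get wrong, so here is a small sketch (TrimReplaceDemo is an illustrative name) that replaces a dot both ways and also checks what trim() returns when there is nothing to remove:

```java
final class TrimReplaceDemo {
    // replace() treats "." as a literal dot.
    static String literalReplace(String text) {
        return text.replace(".", "-");
    }

    // replaceAll() treats "." as a regular expression that matches any character.
    static String regexReplace(String text) {
        return text.replaceAll(".", "-");
    }

    // True if trim() had nothing to remove and returned the very same object.
    static boolean trimReturnsSame(String text) {
        return text.trim() == text;
    }

    public static void main(String[] args) {
        System.out.println("[" + "  Java  ".trim() + "]"); // [Java]
        System.out.println(trimReturnsSame("Java"));       // true
        System.out.println(literalReplace("a.b.c"));       // a-b-c
        System.out.println(regexReplace("a.b.c"));         // -----
    }
}
```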
9) Use split() for splitting String using Regular expression
String in Java is feature rich. It has methods like split(regex), which takes a regular expression and splits the String around its matches. This is particularly useful if you are dealing with a comma separated file (CSV) and want to have the individual parts in a String array. There are other methods related to splitting Strings as well; see this Java tutorial to split string for more details.
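Below is a rough sketch of splitting a simple comma separated line, using hypothetical helper names. It assumes the fields contain no quoted or embedded commas; real CSV files with quoting need a proper parser. It also shows that split(regex) drops trailing empty fields unless a negative limit is passed.

```java
final class SplitDemo {
    // Splits on commas; trailing empty fields are discarded.
    static String[] parseLine(String line) {
        return line.split(",");
    }

    // Splits on commas and keeps trailing empty fields (negative limit).
    static String[] parseLineKeepingEmpty(String line) {
        return line.split(",", -1);
    }

    public static void main(String[] args) {
        for (String part : parseLine("Java,Scala,Kotlin")) {
            System.out.println(part);
        }
        System.out.println(parseLine("a,b,,").length);             // 2
        System.out.println(parseLineKeepingEmpty("a,b,,").length); // 4
    }
}
```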
10) Don't store sensitive data in String
String poses a security threat if used for storing sensitive data like passwords, SSNs or any other sensitive information. Since String is immutable in Java, there is no way to erase the contents of a String, and since String literals are kept in the String pool, they can stay on the Java heap for a long time, which exposes them to anyone who has access to the Java memory, for example by reading a memory dump. Instead, char[] should be used to store passwords or other sensitive information, because an array can be overwritten once it is no longer needed. See Why char[] is more secure than String for storing passwords in Java for more details.
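One possible way to handle a password held in a char[] is sketched below. PasswordDemo and checkAndWipe are invented names, and the comparison is deliberately simple; the point is only that the array can be overwritten as soon as it has been used, which is not possible with a String.

```java
import java.util.Arrays;

final class PasswordDemo {
    // Compares the typed password with the expected one, then wipes the typed array.
    static boolean checkAndWipe(char[] typed, char[] expected) {
        try {
            return Arrays.equals(typed, expected);
        } finally {
            Arrays.fill(typed, '\0'); // erase the characters from memory
        }
    }

    public static void main(String[] args) {
        // Hard-coded values are for demonstration only; real input would come
        // from something like java.io.Console.readPassword().
        char[] typed = { 's', '3', 'c', 'r', 'e', 't' };
        char[] expected = { 's', '3', 'c', 'r', 'e', 't' };
        System.out.println(checkAndWipe(typed, expected)); // true
        System.out.println(typed[0] == '\0');              // true: contents wiped
    }
}
```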
11) Character Encoding and String
Apart from all these 10 facts about String in Java, the most critical thing to know is what encoding your text is using. It does not make sense to handle text without knowing its encoding, and there is no way to interpret bytes as a String if you don't know the encoding they use. You cannot assume that "plain" text is ASCII. If you have text, whether in memory or stored in a file, you must know what encoding it is in, or you cannot display it correctly. When no encoding is given, Java falls back to the platform default, i.e. the character encoding of your server, and believe me, this can cause huge trouble if you are handling Unicode data, especially if you are converting a byte array to an XML String. I have faced instances where our program failed to interpret Strings in European languages, e.g. German and French, because our server was not using a Unicode encoding like UTF-8 or UTF-16. Thankfully, Java allows you to specify the default character encoding for your application using the system property file.encoding. See here to read more about character encoding in Java.
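A safer habit than relying on the default is to name the charset every time bytes are converted to or from a String. The sketch below (EncodingDemo is a made-up class) encodes a German word as UTF-8 and then decodes it once with the right charset and once with the wrong one:

```java
import java.nio.charset.StandardCharsets;

final class EncodingDemo {
    // Encodes and decodes with the same explicit charset.
    static String roundTrip(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // Decodes UTF-8 bytes as ISO-8859-1, which garbles non-ASCII characters.
    static String decodeWithWrongCharset(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String german = "Gr\u00fc\u00dfe"; // "Gruesse" written with u-umlaut and sharp s
        System.out.println(roundTrip(german).equals(german));              // true
        System.out.println(decodeWithWrongCharset(german).equals(german)); // false
    }
}
```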
That's all about String in Java. As I have said, String is very special in Java, sometimes even referred to as a God class. It has some unique features like immutability, concatenation support, caching etc., and to become a serious Java programmer, detailed knowledge of String is quite important. Last but not least, don't forget about character encoding while converting a byte array into a String in Java. Good knowledge of java.lang.String is a must for good Java developers.
15 comments :
Indeed String is a Special class and special knowledge of String helps a lot. Just to add on this article, I would like to share couple of best practices while I am here :
1) While calling equals() method with String literal, prefer defensive approach e.g. calling equals() on String literal rather than on String object e.g.
"USA".equals(country) will return false if country is null. While country.equals("USA") will throw NullPointerException, if country is null.
2) Always override the toString() method, especially for value objects, business objects and domain objects. At the same time, encrypt, mask or simply don't include sensitive information, e.g. SSN, in toString(), because that information may end up in log files, compromising security and confidentiality.
3) Prefer System.out.printf() over System.out.println() for better formatting.
4) Prefer StringBuilder over StringBuffer over String concatenation.
I love the fact that in Java, developers are more mindful of writing immutable classes. I didn't appreciate it before, until of course I ran into problems later due to concurrency.
It is fun and educational to learn about programming reading the source code of Java, such as the String class.
i thank millions && thank alot && thank you "the writer" so much. i love java since i don't know what is java. i m trying for SCJP now. Your posts are very useful && understandable for me. SO THANKS!!!
Point (1) should say you can get *a copy of* the character array used to represent String in Java by calling toCharArray().
From Java 7u6 onwards, the substring() method does NOT use the underlying char array anymore, it DOES copy the part of the array it needs.
Regarding the PermGen comments.
From Oracle's Java SE 7 Features and Enhancements:
"In JDK 7, interned strings are no longer allocated in the permanent generation of the Java heap, but are instead allocated in the main part of the Java heap (known as the young and old generations), along with the other objects created by the application."
Garbage collection strategies vary by Java vendor, version, and JVM configuration.
What happens when we use
String str1="abc";
str1="xyz";
Will GC clean up "abc", or will it stay in the literal pool?
Hi Sanoj,
In this example GC will not clear "abc", because the String is created from a String literal, and anything inside double quotes is a String literal in Java and is maintained in the String pool, a special memory location inside Java memory.
The String pool will not clean up "abc" just because str1 now points to "xyz"; a pooled literal stays reachable for as long as the class that declares it is loaded.
4) Advice to prefer StringBuilder is good for when dealing with single threads only. Otherwise, if multiple threads could be accessing it, a StringBuffer (being synchronised) may be what you are after.
I read that when
String s = new String("JAVA"); is executed,
the JVM checks the SCP (String constant pool) for "JAVA". If the SCP already holds a reference to "JAVA", then only one object is created in the heap; otherwise two objects are created, one referenced from the SCP and one referenced by s. Please answer. Reference link http://stackoverflow.com/questions/2009228/strings-are-objects-in-java-so-why-dont-we-use-new-to-create-them
@Javin Very nice article, I just want to add one more thing to add value to it:
In the first case, a new object is being created in each iteration, in the second case, it's always the same object, being retrieved from the String constant pool.
In Java, when you do:
String bla = new String("xpto");
You force the creation of a new String object, this takes up some time and memory.
On the other hand, when you do:
String muchMuchFaster = "xpto"; //String literal!
The String will only be created the first time (a new object), and it'll be cached in the String constant pool, so every time you refer to it in its literal form, you're getting the exact same object, which is amazingly fast.
Now you may ask... what if two different points in the code retrieve the same literal and change it, aren't there problems bound to happen?!
No, because Strings, in Java, as you may very well know, are immutable! So any operation that would mutate a String returns a new String, leaving any other references to the same literal happy on their way.
This is one of the advantages of immutable data structures, but that's another issue altogether, and I would write a couple of pages on the subject.
Edit
Just a clarification, the constant pool isn't exclusive to String types, you can read more about it here, or if you google for Java constant pool.
http://docs.oracle.com/javase/specs/jvms/se7/jvms7.pdf
Also, a little test you can do to drive the point home:
String a = new String("xpto");
String b = new String("xpto");
String c = "xpto";
String d = "xpto";
System.out.println(a == b);
System.out.println(a == c);
System.out.println(c == d);
With all this, you can probably figure out the results of these Sysouts:
false
false
true
Since c and d are the same object, the == comparison holds true.
as already have been answered the second retrieves the instance from the String pool (remember Strings are immutable).
Additionally you should check the intern() method which enables you to put new String() into a pool in case you do not know the constant value of the string in runtime: e.g:
String s = stringVar.intern();
or
new String(stringVar).intern();
One additional fact you should know: besides its characters, a String object also caches its hash code once it has been computed. This enables fast HashMap lookups by String key in the relevant data structures (instead of recomputing the hash code each time).
Few more things to add into this excellent articles :
1) Strings are stored as UTF-16 characters and not UTF-8; maybe some day they will move to UTF-32 as well.
2) The JVM does not intern all Strings created by Java code, only String literals are interned. A String created using the new operator is not interned until you explicitly call the intern() method on it.
what is the function of string?????
Hi
I want to ask:
if I create a String object like this
String s = "hello";
-> the "hello" String literal goes into the String literal pool,
and if I create a String object like this
String s1 = new String("hello");
-> then where does the JVM store the "hello" String literal,
in the String literal pool or on the heap?
I am confused because we are creating the String object using the new operator, and the JVM will not look into the String pool, right?