Checking for a not null, not blank String in Java


I am trying to check whether a Java String is not null, not empty, and not whitespace-only.

In my mind, this code should have been quite up to the job.

    public static boolean isEmpty(String s) {
        // Non-empty only if the reference is non-null and something survives trim().
        if ((s != null) && (s.trim().length() > 0))
            return false;
        else
            return true;
    }

As per the documentation, String.trim() should work thus:

Returns a copy of the string, with leading and trailing whitespace omitted.

If this String object represents an empty character sequence, or the first and last characters of the character sequence represented by this String object both have codes greater than '\u0020' (the space character), then a reference to this String object is returned.
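To make the quoted contract concrete, here is a minimal sketch of what trim() does in practice. The class name TrimDemo and the helper returnsSameObject are hypothetical names introduced only for this illustration.

```java
final class TrimDemo {

    // Hypothetical helper: true when trim() had nothing to strip and returned the same object.
    static boolean returnsSameObject(String s) {
        return s.trim() == s;
    }

    public static void main(String[] args) {
        // Tab, line feed, carriage return and space all have codes <= U+0020,
        // so trim() strips them from both ends.
        String padded = "\t\n  hello \r\n";
        System.out.println("[" + padded.trim() + "]");        // prints "[hello]"

        // First and last characters are above U+0020: the same reference comes back.
        System.out.println(returnsSameObject("hello"));       // prints "true"
        System.out.println(returnsSameObject(" hello"));      // prints "false"
    }
}
```

Note that trim() looks only at the numeric code of each character, not at whether Unicode classifies it as whitespace.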

However, apache/commons/lang/StringUtils.java does it a little differently.

    public static boolean isBlank(String str) {
        int strLen;
        if (str == null || (strLen = str.length()) == 0) {
            return true;
        }
        // Any single non-whitespace character makes the string non-blank.
        for (int i = 0; i < strLen; i++) {
            if ((Character.isWhitespace(str.charAt(i)) == false)) {
                return false;
            }
        }
        return true;
    }

As per the documentation, Character.isWhitespace():

Determines if the specified character is white space according to Java. A character is a Java whitespace character if and only if it satisfies one of the following criteria:

  • It is a Unicode space character (SPACE_SEPARATOR, LINE_SEPARATOR, or PARAGRAPH_SEPARATOR) but is not also a non-breaking space ('\u00A0', '\u2007', '\u202F').
  • It is '\t', U+0009 HORIZONTAL TABULATION.
  • It is '\n', U+000A LINE FEED.
  • It is '\u000B', U+000B VERTICAL TABULATION.
  • It is '\f', U+000C FORM FEED.
  • It is '\r', U+000D CARRIAGE RETURN.
  • It is '\u001C', U+001C FILE SEPARATOR.
  • It is '\u001D', U+001D GROUP SEPARATOR.
  • It is '\u001E', U+001E RECORD SEPARATOR.
  • It is '\u001F', U+001F UNIT SEPARATOR.
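The criteria above can be spot-checked character by character. The following is a small sketch; WhitespaceDemo and describe are hypothetical names, and the sample characters are my own picks, not part of the documentation.

```java
final class WhitespaceDemo {

    // Formats one character's code point together with its Character.isWhitespace result.
    static String describe(char c) {
        return String.format("U+%04X whitespace=%b", (int) c, Character.isWhitespace(c));
    }

    public static void main(String[] args) {
        char[] samples = {
            ' ',            // U+0020 SPACE
            '\t',           // U+0009 HORIZONTAL TABULATION
            (char) 0x001F,  // UNIT SEPARATOR, listed explicitly
            (char) 0x00A0,  // NO-BREAK SPACE, explicitly excluded
            (char) 0x2008,  // PUNCTUATION SPACE, a Unicode space separator
            (char) 0x0002   // START OF TEXT, a control character that is not listed
        };
        for (char c : samples) {
            System.out.println(describe(c));
        }
    }
}
```

Of these samples, only U+00A0 and U+0002 should be reported as non-whitespace.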

If I am not mistaken - or it might be that I am just not reading it correctly - String.trim() should take away any of the characters that are being checked by Character.isWhitespace(). All of them seem to be at or below '\u0020'.

In that case, the simpler isEmpty function seems to cover all the scenarios that the lengthier isBlank covers.

  1. Is there a string that will make isEmpty and isBlank behave differently in a test case?
  2. Assuming there are none, is there any other consideration because of which I should choose isBlank and not use isEmpty?

For those interested in actually running a test, here are the methods and unit tests.

    public class StringUtil {

        public static boolean isEmpty(String s) {
            if ((s != null) && (s.trim().length() > 0))
                return false;
            else
                return true;
        }

        public static boolean isBlank(String str) {
            int strLen;
            if (str == null || (strLen = str.length()) == 0) {
                return true;
            }
            for (int i = 0; i < strLen; i++) {
                if ((Character.isWhitespace(str.charAt(i)) == false)) {
                    return false;
                }
            }
            return true;
        }
    }

And the unit tests:

    // Assuming JUnit 4.
    import static org.junit.Assert.assertTrue;

    import org.junit.Test;

    public class StringUtilTest {

        @Test
        public void test() {

            String s = null;
            assertTrue(StringUtil.isEmpty(s));
            assertTrue(StringUtil.isBlank(s));

            s = "";
            assertTrue(StringUtil.isEmpty(s));
            assertTrue(StringUtil.isBlank(s));

            s = " ";
            assertTrue(StringUtil.isEmpty(s));
            assertTrue(StringUtil.isBlank(s));

            s = "   ";
            assertTrue(StringUtil.isEmpty(s));
            assertTrue(StringUtil.isBlank(s));

            // Placeholder with non-whitespace content, so both checks must report false.
            s = "  abc  ";
            assertTrue(StringUtil.isEmpty(s) == false);
            assertTrue(StringUtil.isBlank(s) == false);
        }
    }

Update: It was a really interesting discussion - and this is why I love Stack Overflow and the folks here. By the way, coming back to the question, we got:

  • A program showing which characters will make these behave differently. The code is at https://ideone.com/ely5wv. Thanks @dukeling.
  • A performance-related reason for choosing the standard isBlank(). Thanks @devconsole.
  • A comprehensive explanation by @nhahtdh. Thanks mate.

Is there a string that will make isEmpty and isBlank behave differently in a test case?

Note that Character.isWhitespace can recognize Unicode characters and returns true for Unicode whitespace characters.

Determines if the specified character is white space according to Java. A character is a Java whitespace character if and only if it satisfies one of the following criteria:

  • It is a Unicode space character (SPACE_SEPARATOR, LINE_SEPARATOR, or PARAGRAPH_SEPARATOR) but is not also a non-breaking space ('\u00A0', '\u2007', '\u202F').

  • [...]

On the other hand, the trim() method trims all control characters whose code points are below U+0020, plus the space character (U+0020).

Therefore, the two methods behave differently in the presence of a Unicode whitespace character. For example: "\u2008". The same goes when the string contains control characters that the Character.isWhitespace method does not consider whitespace. For example: "\002".

If you were to write a regular expression to do this (which is slower than looping through the string and checking each character):
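A short sketch that exercises both examples follows. The two methods mirror the logic of the question's isEmpty and the Commons Lang isBlank in condensed form; the class name BlankVsEmptyDemo is hypothetical.

```java
final class BlankVsEmptyDemo {

    // trim()-based check: strips every character with a code <= U+0020.
    static boolean isEmpty(String s) {
        return s == null || s.trim().isEmpty();
    }

    // Character.isWhitespace-based check, following the loop quoted in the question.
    static boolean isBlank(String s) {
        if (s == null) {
            return true;
        }
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isWhitespace(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String punctuationSpace = String.valueOf((char) 0x2008); // same as "\u2008"
        String startOfText = "\002";                             // octal escape for U+0002

        // Unicode whitespace above U+0020: trim() keeps it, isWhitespace accepts it.
        System.out.println(isEmpty(punctuationSpace)); // prints "false"
        System.out.println(isBlank(punctuationSpace)); // prints "true"

        // Control character below U+0020: trim() strips it, isWhitespace rejects it.
        System.out.println(isEmpty(startOfText));      // prints "true"
        System.out.println(isBlank(startOfText));      // prints "false"
    }
}
```

So the two checks disagree in both directions, not just one.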

  • isEmpty() is equivalent to .matches("[\\x00-\\x20]*")
  • isBlank() is equivalent to .matches("\\p{javaWhitespace}*")

(The isEmpty() and isBlank() methods both accept a null String reference, so they are not exactly equivalent to the regex solution, but putting that aside, they are equivalent.)

Note that \p{javaWhitespace}, as its name implies, is Java-specific syntax for accessing the character class defined by the Character.isWhitespace method.
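As a rough illustration of the two regular expressions, the sketch below wraps each one in a method. The class and method names are hypothetical, and the explicit null checks are my addition so that the methods behave like isEmpty() and isBlank() for a null reference.

```java
import java.util.regex.Pattern;

final class RegexEquivalents {

    // Characters U+0000 through U+0020: the range that trim() strips.
    private static final Pattern TRIMMABLE_ONLY = Pattern.compile("[\\x00-\\x20]*");

    // Characters for which Character.isWhitespace returns true.
    private static final Pattern JAVA_WHITESPACE_ONLY = Pattern.compile("\\p{javaWhitespace}*");

    static boolean isEmptyRegex(String s) {
        return s == null || TRIMMABLE_ONLY.matcher(s).matches();
    }

    static boolean isBlankRegex(String s) {
        return s == null || JAVA_WHITESPACE_ONLY.matcher(s).matches();
    }

    public static void main(String[] args) {
        String punctuationSpace = String.valueOf((char) 0x2008);
        System.out.println(isEmptyRegex(punctuationSpace)); // prints "false"
        System.out.println(isBlankRegex(punctuationSpace)); // prints "true"
        System.out.println(isEmptyRegex("\002"));           // prints "true"
        System.out.println(isBlankRegex("\002"));           // prints "false"
    }
}
```

Compiling the patterns once avoids recompiling the expression on every call, which String.matches would do.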

Assuming there are none, is there any other consideration because of which I should choose isBlank and not use isEmpty?

It depends. However, I think the explanation in the part above should be sufficient for you to decide. To sum up the difference:

  • isEmpty() will consider the string empty if it contains only control characters [1] below U+0020 and the space character (U+0020).

  • isBlank() will consider the string empty if it contains only whitespace characters as defined by the Character.isWhitespace method, which includes Unicode whitespace characters.

[1] There is also a control character at U+007F DELETE, which is not trimmed by the trim() method.
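The U+007F case can be checked with a few lines. This is only a sketch; DeleteCharDemo and survivesTrim are hypothetical names.

```java
final class DeleteCharDemo {

    // A one-character string holding U+007F DELETE.
    static final String DELETE = String.valueOf((char) 0x7F);

    // Hypothetical helper: true when something is left after trim().
    static boolean survivesTrim(String s) {
        return !s.trim().isEmpty();
    }

    public static void main(String[] args) {
        char del = DELETE.charAt(0);
        System.out.println(Character.isISOControl(del));  // prints "true"
        System.out.println(survivesTrim(DELETE));         // prints "true"
        System.out.println(Character.isWhitespace(del));  // prints "false"
    }
}
```

A string consisting only of U+007F is therefore treated as non-empty by both the trim()-based check and the Character.isWhitespace-based check.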

