Java - Strange behaviour of String.length()

I have a class Main:

import java.io.IOException;
import java.util.List;

public class Main {
    // args[0] - path to the file with the first and last words
    // args[1] - path to the dictionary file
    public static void main(String[] args) {
        try {
            List<String> firstLastWords = FileParser.getWords(args[0]);
            System.out.println(firstLastWords);
            System.out.println(firstLastWords.get(0).length());
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
}

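and a class FileParser:

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class FileParser {

    public FileParser() {
    }

    final static Charset ENCODING = StandardCharsets.UTF_8;

    public static List<String> getWords(String filePath) throws IOException {
        List<String> list = new ArrayList<String>();
        Path path = Paths.get(filePath);

        // try-with-resources closes the reader, so no explicit close() is needed
        try (BufferedReader reader = Files.newBufferedReader(path, ENCODING)) {
            String line = null;
            while ((line = reader.readLine()) != null) {
                // remove all whitespace from the line
                String line1 = line.replaceAll("\\s+", "");
                if (!line1.equals("") && !line1.equals(" ")) {
                    list.add(line1);
                }
            }
        }
        return list;
    }
}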

args[0] is the path to a text file with 2 words, one per line. If the file contains:

тор
кит

the program prints:

[тор, кит]
4

If the file contains:

т
тор
кит

the program prints:

[т, тор, кит]
2

If the file starts with an empty line:

<empty line>
тор
кит

the program prints:

[, тор, кит]
1

where the number is the length of the first string in the list.

So the question is: why does it count one extra character?

Thanks, all.

This symbol is, as @Bill said, a BOM (http://en.wikipedia.org/wiki/byte_order_mark), which resides at the beginning of the text file. I found the symbol with this line:

System.out.println((int) firstLastWords.get(0).charAt(0));

It gave me 65279.
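
The value 65279 is 0xFEFF, the code point of the byte order mark. Here is a minimal sketch of how it ends up in the first string, assuming the file was saved as "UTF-8 with BOM" (leading bytes EF BB BF). The class name BomDemo and the in-memory byte array are illustrative and not part of the original program. It also shows why the original replaceAll("\\s+", "") did not remove the character: Java's default \s class does not include U+FEFF.

```java
import java.nio.charset.StandardCharsets;

public class BomDemo {
    // Builds the bytes of a "UTF-8 with BOM" file holding one word, then decodes them.
    static String decodeWithBom() {
        byte[] bom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF}; // UTF-8 encoding of U+FEFF
        // The first word from the question, written with Unicode escapes
        // so the encoding of this source file does not matter.
        byte[] word = "\u0442\u043E\u0440".getBytes(StandardCharsets.UTF_8);

        byte[] fileBytes = new byte[bom.length + word.length];
        System.arraycopy(bom, 0, fileBytes, 0, bom.length);
        System.arraycopy(word, 0, fileBytes, bom.length, word.length);

        // Java's UTF-8 decoder keeps the BOM as an ordinary character.
        return new String(fileBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String first = decodeWithBom();
        System.out.println(first.length());                        // 4, not 3
        System.out.println((int) first.charAt(0));                 // 65279
        System.out.println(first.replaceAll("\\s+", "").length()); // still 4
    }
}
```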

Then I changed this line:

String line1 = line.replaceAll("\\s+", "");

to this:

String line1 = line.replaceAll("\uFEFF", "");
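
Note that the changed line no longer strips whitespace. If both are wanted, one possible approach is to drop a leading BOM from the first line only and keep the whitespace removal for every line. The sketch below assumes the BOM can only appear at the very start of the file; the class BomAwareCleaner and its methods are hypothetical names, and the input list stands in for the lines returned by reader.readLine().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BomAwareCleaner {
    private static final char BOM = (char) 0xFEFF;

    // Removes a single leading BOM, if present.
    static String stripLeadingBom(String line) {
        if (!line.isEmpty() && line.charAt(0) == BOM) {
            return line.substring(1);
        }
        return line;
    }

    // Applies the same filtering as getWords, but BOM-aware on the first line.
    static List<String> cleanLines(List<String> rawLines) {
        List<String> words = new ArrayList<>();
        boolean firstLine = true;
        for (String line : rawLines) {
            if (firstLine) {
                line = stripLeadingBom(line);
                firstLine = false;
            }
            String word = line.replaceAll("\\s+", "");
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // The two words from the question, with a BOM before the first one.
        List<String> raw = Arrays.asList("\uFEFF\u0442\u043E\u0440", " \u043A\u0438\u0442 ");
        List<String> words = cleanLines(raw);
        System.out.println(words.size());          // 2
        System.out.println(words.get(0).length()); // 3
    }
}
```

With this variant, a file whose first line holds only the BOM yields no empty first element, because the line is empty once the BOM is removed.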
