I have a Unicode file that was exported from a database (Vertica). The column delimiter is Ctrl+B and the record delimiter is a newline (\n). Whenever there is a newline within a column value, Ctrl+A is used as the escape character.
When I use BufferedReader.readLine() to read this file, the records with IDs 2 and 4 are each read as two records, whereas I want to read each of them as a single whole record, as shown in the output.
Here is an example input file, where | stands for Ctrl+B and ^ stands for Ctrl+A.
Input:

    id|name|job desc
    ----------------
    1|xxxx|so job
    2|yyyy|so careers^
    job
    3|rrrrr|so
    4|zzzz^
    zz|so job
    5|aaaa|yu

Output:

    id|name|job desc
    ----------------
    1|xxxx|so job
    2|yyyy|so careers job
    3|rrrrr|so
    4|zzzz zz|so job
    5|aaaa|yu
The file is huge, so I can't use StringEscapeUtils. Any suggestions on this?
You can use a Scanner with a custom delimiter. The delimiter I use is set to match \n but not \u0001\n (where \u0001 represents Ctrl+A):
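    import java.io.File;
    import java.io.FileNotFoundException;
    import java.io.PrintWriter;
    import java.util.Scanner;
    import java.util.regex.Pattern;

    // Split only on a newline that is NOT preceded by Ctrl+A (\u0001).
    Pattern recordEnd = Pattern.compile("(?<!\\u0001)\\n");

    try (PrintWriter writer = new PrintWriter("dboutput.txt");
         Scanner sc = new Scanner(new File("dbinput.txt"))) {
        sc.useDelimiter(recordEnd);
        while (sc.hasNext()) {
            // Each token is one whole record. The escaped newline is folded
            // into a space here; drop the replace() to keep it as-is.
            writer.println(sc.next().replace("\u0001\n", " "));
        }
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    }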
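If you would rather stay with BufferedReader.readLine(), as in the question, a rough alternative is to keep reading while a line ends with Ctrl+A and join the pieces yourself. The sketch below is only an illustration: the class name `EscapedRecordReader` and the methods `nextRecord` and `readAll` are hypothetical, it assumes the escape is always the last character before the newline, and it assumes the escaped newline should become a single space. The demo uses `|` in place of Ctrl+B purely for readability.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class EscapedRecordReader {
    // Ctrl+A, assumed to sit directly before an escaped newline.
    private static final char ESCAPE = '\u0001';

    /** Returns the next logical record, or null at end of input. */
    public static String nextRecord(BufferedReader in) throws IOException {
        String line = in.readLine();
        if (line == null) {
            return null;
        }
        StringBuilder record = new StringBuilder();
        // While the physical line ends with the escape, the record continues.
        while (line != null && !line.isEmpty()
                && line.charAt(line.length() - 1) == ESCAPE) {
            // Drop the escape and stand in a space for the escaped newline.
            record.append(line, 0, line.length() - 1).append(' ');
            line = in.readLine();
        }
        if (line != null) {
            record.append(line);
        }
        return record.toString();
    }

    /** Convenience for small inputs; a huge file should be streamed instead. */
    public static List<String> readAll(Reader source) {
        List<String> records = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String record;
            while ((record = nextRecord(in)) != null) {
                records.add(record);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        String sample = "1|xxxx|so job\n"
                + "2|yyyy|so careers\u0001\njob\n"
                + "3|rrrrr|so\n";
        for (String record : readAll(new StringReader(sample))) {
            System.out.println(record);
        }
    }
}
```

For the real file you would call `nextRecord` in a loop and write each record out immediately, so only one record is held in memory at a time. Since the file is Unicode, it is probably worth opening it with an explicit charset rather than relying on the platform default.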