- java.lang.Object
  - org.apache.commons.csv.CSVParser
- All Implemented Interfaces:
- java.io.Closeable, java.lang.AutoCloseable, java.lang.Iterable<CSVRecord>
public final class CSVParser extends java.lang.Object implements java.lang.Iterable<CSVRecord>, java.io.Closeable
Parses CSV files according to the specified format. Because CSV appears in many different dialects, the parser supports many formats by allowing the specification of a CSVFormat. The parser works record wise. It is not possible to go back once a record has been parsed from the input stream.
Creating instances
There are several static factory methods that can be used to create instances for various types of resources:
parse(java.io.File, Charset, CSVFormat)
parse(String, CSVFormat)
parse(java.net.URL, java.nio.charset.Charset, CSVFormat)
Alternatively, parsers can be created by passing a Reader directly to one of the constructors. For those who like fluent APIs, parsers can be created using CSVFormat.parse(java.io.Reader) as a shortcut:
for (CSVRecord record : CSVFormat.EXCEL.parse(in)) { ... }
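The fluent shortcut above can be sketched as a complete program. This is a minimal illustration, assuming Commons CSV is on the classpath; the class name `FluentParseSketch`, the helper `firstColumn`, and the sample input are made up for this example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

public class FluentParseSketch {

    // Hypothetical helper: collects the first column of every record.
    static List<String> firstColumn(String csvText) throws IOException {
        List<String> values = new ArrayList<>();
        Reader in = new StringReader(csvText);
        // try-with-resources closes the parser once iteration is done
        try (CSVParser parser = CSVFormat.EXCEL.parse(in)) {
            for (CSVRecord record : parser) {
                values.add(record.get(0));
            }
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        // Sample input invented for illustration
        System.out.println(firstColumn("a,b\r\nc,d"));
    }
}
```

Wrapping the parser in try-with-resources is a design choice of this sketch; it releases the underlying reader even if iteration stops early.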
Parsing record wise
To parse a CSV input from a file, you write:
File csvData = new File("/path/to/csv");
// UTF-8 is assumed here; pass the charset the file is actually encoded in
CSVParser parser = CSVParser.parse(csvData, StandardCharsets.UTF_8, CSVFormat.RFC4180);
for (CSVRecord csvRecord : parser) { ... }
This will read and parse the contents of the file using the RFC 4180 format.
To parse CSV input in a format like Excel, you write:
CSVParser parser = CSVParser.parse(csvData, StandardCharsets.UTF_8, CSVFormat.EXCEL);
for (CSVRecord csvRecord : parser) { ... }
If the predefined formats don't match the format at hand, custom formats can be defined. More information about customising CSVFormats is available in the CSVFormat Javadoc.
Parsing into memory
If parsing record wise is not desired, the contents of the input can be read completely into memory.
Reader in = new StringReader("a;b\nc;d");
CSVParser parser = new CSVParser(in, CSVFormat.EXCEL);
List<CSVRecord> list = parser.getRecords();
There are two constraints that have to be kept in mind:
- Parsing into memory starts at the current position of the parser. If you have already parsed records from the input, those records will not end up in the in-memory representation of your CSV data.
- Parsing into memory may consume a lot of system resources depending on the input. For example, if you're parsing a 150 MB file of CSV data, the contents will be read completely into memory.
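The first constraint can be illustrated with a short sketch: one record is consumed through the iterator, and getRecords() then returns only what is left. The class name `RemainingRecordsSketch`, the helper name, and the sample data are assumptions made for this example.

```java
import java.io.IOException;
import java.util.Iterator;
import java.util.List;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

public class RemainingRecordsSketch {

    // Hypothetical helper: skips one record, then reads the rest into memory.
    static List<CSVRecord> recordsAfterFirst(String csvText) throws IOException {
        try (CSVParser parser = CSVParser.parse(csvText, CSVFormat.DEFAULT)) {
            Iterator<CSVRecord> it = parser.iterator();
            if (it.hasNext()) {
                it.next(); // consumed here, so it will not appear in the list below
            }
            return parser.getRecords();
        }
    }

    public static void main(String[] args) throws IOException {
        // Three records in, two records out
        List<CSVRecord> rest = recordsAfterFirst("1,a\n2,b\n3,c");
        System.out.println(rest.size() + " records remain, starting with id " + rest.get(0).get(0));
    }
}
```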
Notes
Internal parser state is completely covered by the format and the reader-state.
- See Also:
- package documentation for more details
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| void | close() | Closes resources. |
| long | getCurrentLineNumber() | Returns the current line number in the input stream. |
| java.lang.String | getFirstEndOfLine() | Gets the first end-of-line string encountered. |
| java.util.Map<java.lang.String,java.lang.Integer> | getHeaderMap() | Returns a copy of the header map that iterates in column order. |
| long | getRecordNumber() | Returns the current record number in the input stream. |
| java.util.List<CSVRecord> | getRecords() | Parses the CSV input according to the given format and returns the content as a list of CSVRecords. |
| boolean | isClosed() | Gets whether this parser is closed. |
| java.util.Iterator<CSVRecord> | iterator() | Returns an iterator on the records. |
| static CSVParser | parse(java.io.File file, java.nio.charset.Charset charset, CSVFormat format) | Creates a parser for the given File. |
| static CSVParser | parse(java.io.InputStream inputStream, java.nio.charset.Charset charset, CSVFormat format) | Creates a CSV parser using the given CSVFormat. |
| static CSVParser | parse(java.io.Reader reader, CSVFormat format) | Creates a CSV parser using the given CSVFormat. |
| static CSVParser | parse(java.lang.String string, CSVFormat format) | Creates a parser for the given String. |
| static CSVParser | parse(java.net.URL url, java.nio.charset.Charset charset, CSVFormat format) | Creates a parser for the given URL. |
| static CSVParser | parse(java.nio.file.Path path, java.nio.charset.Charset charset, CSVFormat format) | Creates a parser for the given Path. |
Constructor Detail
-
CSVParser
public CSVParser(java.io.Reader reader, CSVFormat format) throws java.io.IOException
Customized CSV parser using the given CSVFormat.
If you do not read all records from the given reader, you should call close() on the parser, unless you close the reader.
- Parameters:
- reader - a Reader containing CSV-formatted input. Must not be null.
- format - the CSVFormat used for CSV parsing. Must not be null.
- Throws:
- java.lang.IllegalArgumentException - If the parameters of the format are inconsistent or if either reader or format is null.
- java.io.IOException - If there is a problem reading the header or skipping the first record
-
CSVParser
public CSVParser(java.io.Reader reader, CSVFormat format, long characterOffset, long recordNumber) throws java.io.IOException
Customized CSV parser using the given CSVFormat.
If you do not read all records from the given reader, you should call close() on the parser, unless you close the reader.
- Parameters:
- reader - a Reader containing CSV-formatted input. Must not be null.
- format - the CSVFormat used for CSV parsing. Must not be null.
- characterOffset - Lexer offset when the parser does not start parsing at the beginning of the source.
- recordNumber - The next record number to assign
- Throws:
- java.lang.IllegalArgumentException - If the parameters of the format are inconsistent or if either reader or format is null.
- java.io.IOException - If there is a problem reading the header or skipping the first record
- Since:
- 1.1
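As a rough sketch of how this constructor might be used to resume parsing part-way through a source: the remaining text is handed to a new parser together with the offset and the record number to continue from. The class name `ResumeParsingSketch`, the helper name, and the numbers 100 and 10 are invented for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.List;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

public class ResumeParsingSketch {

    // Hypothetical helper: parses the tail of a source and reports the number
    // assigned to the first record it produces.
    static long firstRecordNumber(String remainingCsv, long characterOffset, long nextRecordNumber)
            throws IOException {
        try (CSVParser parser = new CSVParser(
                new StringReader(remainingCsv), CSVFormat.DEFAULT, characterOffset, nextRecordNumber)) {
            List<CSVRecord> records = parser.getRecords();
            return records.get(0).getRecordNumber();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed scenario: 100 characters and 9 records were already handled elsewhere
        System.out.println(firstRecordNumber("x,y\nz,w", 100L, 10L));
    }
}
```

Since recordNumber is described as the next record number to assign, the first record parsed here should report that number.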
Method Detail
-
parse
public static CSVParser parse(java.io.File file, java.nio.charset.Charset charset, CSVFormat format) throws java.io.IOException
Creates a parser for the given File.
- Parameters:
- file - a CSV file. Must not be null.
- charset - A Charset
- format - the CSVFormat used for CSV parsing. Must not be null.
- Returns:
- a new parser
- Throws:
- java.lang.IllegalArgumentException - If the parameters of the format are inconsistent or if either file or format is null.
- java.io.IOException - If an I/O error occurs
-
parse
public static CSVParser parse(java.io.InputStream inputStream, java.nio.charset.Charset charset, CSVFormat format) throws java.io.IOException
Creates a CSV parser using the given CSVFormat.
If you do not read all records from the given inputStream, you should call close() on the parser, unless you close the inputStream.
- Parameters:
- inputStream - an InputStream containing CSV-formatted input. Must not be null.
- charset - a Charset.
- format - the CSVFormat used for CSV parsing. Must not be null.
- Returns:
- a new CSVParser configured with the given input stream and format.
- Throws:
- java.lang.IllegalArgumentException - If the parameters of the format are inconsistent or if either inputStream or format is null.
- java.io.IOException - If there is a problem reading the header or skipping the first record
- Since:
- 1.5
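A minimal sketch of parsing from an InputStream, assuming UTF-8 encoded bytes held in memory; the class name `StreamParseSketch` and the helper `countRecords` are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;

public class StreamParseSketch {

    // Hypothetical helper: counts the records found in a block of UTF-8 bytes.
    static int countRecords(byte[] csvBytes) throws IOException {
        // Both resources are closed in reverse order when the block exits
        try (InputStream in = new ByteArrayInputStream(csvBytes);
             CSVParser parser = CSVParser.parse(in, StandardCharsets.UTF_8, CSVFormat.DEFAULT)) {
            return parser.getRecords().size();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "a,b\nc,d".getBytes(StandardCharsets.UTF_8);
        System.out.println(countRecords(data));
    }
}
```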
-
parse
public static CSVParser parse(java.nio.file.Path path, java.nio.charset.Charset charset, CSVFormat format) throws java.io.IOException
Creates a parser for the given Path.
- Parameters:
- path - a CSV file. Must not be null.
- charset - A Charset
- format - the CSVFormat used for CSV parsing. Must not be null.
- Returns:
- a new parser
- Throws:
- java.lang.IllegalArgumentException - If the parameters of the format are inconsistent or if either path or format is null.
- java.io.IOException - If an I/O error occurs
- Since:
- 1.5
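The following sketch writes a small temporary file and parses it through the Path overload. The class name `PathParseSketch`, the helper names, the file contents, and the UTF-8 charset are all assumptions for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

public class PathParseSketch {

    // Hypothetical helper: returns the first column of each record in the file.
    static List<String> readFirstColumn(Path path) throws IOException {
        List<String> values = new ArrayList<>();
        try (CSVParser parser = CSVParser.parse(path, StandardCharsets.UTF_8, CSVFormat.DEFAULT)) {
            for (CSVRecord record : parser) {
                values.add(record.get(0));
            }
        }
        return values;
    }

    // Creates a throwaway file, parses it, and cleans up afterwards.
    static List<String> demo() throws IOException {
        Path tmp = Files.createTempFile("csv-sketch", ".csv");
        try {
            Files.write(tmp, "x,1\ny,2".getBytes(StandardCharsets.UTF_8));
            return readFirstColumn(tmp);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```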
-
parse
public static CSVParser parse(java.io.Reader reader, CSVFormat format) throws java.io.IOException
Creates a CSV parser using the given CSVFormat.
If you do not read all records from the given reader, you should call close() on the parser, unless you close the reader.
- Parameters:
- reader - a Reader containing CSV-formatted input. Must not be null.
- format - the CSVFormat used for CSV parsing. Must not be null.
- Returns:
- a new CSVParser configured with the given reader and format.
- Throws:
- java.lang.IllegalArgumentException - If the parameters of the format are inconsistent or if either reader or format is null.
- java.io.IOException - If there is a problem reading the header or skipping the first record
- Since:
- 1.5
-
parse
public static CSVParser parse(java.lang.String string, CSVFormat format) throws java.io.IOException
Creates a parser for the given String.
- Parameters:
- string - a CSV string. Must not be null.
- format - the CSVFormat used for CSV parsing. Must not be null.
- Returns:
- a new parser
- Throws:
- java.lang.IllegalArgumentException - If the parameters of the format are inconsistent or if either string or format is null.
- java.io.IOException - If an I/O error occurs
-
parse
public static CSVParser parse(java.net.URL url, java.nio.charset.Charset charset, CSVFormat format) throws java.io.IOException
Creates a parser for the given URL.
If you do not read all records from the given url, you should call close() on the parser, unless you close the url.
- Parameters:
- url - a URL. Must not be null.
- charset - the charset for the resource. Must not be null.
- format - the CSVFormat used for CSV parsing. Must not be null.
- Returns:
- a new parser
- Throws:
- java.lang.IllegalArgumentException - If the parameters of the format are inconsistent or if either url, charset or format is null.
- java.io.IOException - If an I/O error occurs
-
close
public void close() throws java.io.IOException
Closes resources.
- Specified by:
- close in interface java.lang.AutoCloseable
- Specified by:
- close in interface java.io.Closeable
- Throws:
- java.io.IOException - If an I/O error occurs
-
getCurrentLineNumber
public long getCurrentLineNumber()
Returns the current line number in the input stream.
ATTENTION: If your CSV input has multi-line values, the returned number does not correspond to the record number.
- Returns:
- current line number
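The difference between line numbers and record numbers can be seen with a quoted value that spans two lines. This is a sketch only; the class name `LineVersusRecordSketch`, the helper name, and the sample input are made up.

```java
import java.io.IOException;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;

public class LineVersusRecordSketch {

    // Hypothetical helper: parses everything, then reports
    // { current line number, current record number }.
    static long[] lineAndRecordNumbers(String csvText) throws IOException {
        try (CSVParser parser = CSVParser.parse(csvText, CSVFormat.DEFAULT)) {
            parser.getRecords(); // read to the end of the input
            return new long[] { parser.getCurrentLineNumber(), parser.getRecordNumber() };
        }
    }

    public static void main(String[] args) throws IOException {
        // The second record contains a quoted value with an embedded line break
        String csv = "id,note\n1,\"first line\nsecond line\"\n2,plain\n";
        long[] numbers = lineAndRecordNumbers(csv);
        System.out.println("lines=" + numbers[0] + ", records=" + numbers[1]);
    }
}
```

With this input the line count comes out higher than the record count, because one record occupies two physical lines.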
-
getFirstEndOfLine
public java.lang.String getFirstEndOfLine()
Gets the first end-of-line string encountered.
- Returns:
- the first end-of-line string
- Since:
- 1.5
-
getHeaderMap
public java.util.Map<java.lang.String,java.lang.Integer> getHeaderMap()
Returns a copy of the header map that iterates in column order.
The map keys are column names. The map values are 0-based indices.
- Returns:
- a copy of the header map that iterates in column order.
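A small sketch of reading the header map, assuming the format is configured with withFirstRecordAsHeader() so that the first record supplies the column names (newer releases of the library may prefer a builder-style configuration instead). The class name `HeaderMapSketch`, the helper name, and the sample data are invented.

```java
import java.io.IOException;
import java.util.Map;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;

public class HeaderMapSketch {

    // Hypothetical helper: returns column name -> 0-based column index.
    static Map<String, Integer> headerOf(String csvText) throws IOException {
        // Assumption: the first record of the input is the header row
        CSVFormat format = CSVFormat.DEFAULT.withFirstRecordAsHeader();
        try (CSVParser parser = CSVParser.parse(csvText, format)) {
            return parser.getHeaderMap();
        }
    }

    public static void main(String[] args) throws IOException {
        Map<String, Integer> header = headerOf("id,name\n1,Ada");
        for (Map.Entry<String, Integer> column : header.entrySet()) {
            System.out.println(column.getKey() + " is column " + column.getValue());
        }
    }
}
```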
-
getRecordNumber
public long getRecordNumber()
Returns the current record number in the input stream.
ATTENTION: If your CSV input has multi-line values, the returned number does not correspond to the line number.
- Returns:
- current record number
-
getRecords
public java.util.List<CSVRecord> getRecords() throws java.io.IOException
Parses the CSV input according to the given format and returns the content as a list of CSVRecords.
The returned content starts at the current parse-position in the stream.
- Returns:
- list of CSVRecords, may be empty
- Throws:
- java.io.IOException - on parse error or input read-failure
-
isClosed
public boolean isClosed()
Gets whether this parser is closed.
- Returns:
- whether this parser is closed.
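A brief sketch of how close() and isClosed() relate: the state is checked before and after an explicit close. The class name `ClosedStateSketch` and the helper name are assumptions for this example.

```java
import java.io.IOException;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;

public class ClosedStateSketch {

    // Hypothetical helper: returns { closed before close(), closed after close() }.
    static boolean[] closedBeforeAndAfter() throws IOException {
        CSVParser parser = CSVParser.parse("a,b", CSVFormat.DEFAULT);
        boolean before = parser.isClosed();
        parser.close();
        boolean after = parser.isClosed();
        return new boolean[] { before, after };
    }

    public static void main(String[] args) throws IOException {
        boolean[] state = closedBeforeAndAfter();
        System.out.println("before=" + state[0] + ", after=" + state[1]);
    }
}
```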
-
iterator
public java.util.Iterator<CSVRecord> iterator()
Returns an iterator on the records.
An IOException caught during the iteration is re-thrown as an IllegalStateException.
If the parser is closed, a call to Iterator.next() will throw a NoSuchElementException.
- Specified by:
- iterator in interface java.lang.Iterable<CSVRecord>
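The exception wrapping during iteration can be sketched with a reader that always fails. The failing reader, the class name `IteratorFailureSketch`, and the helper names are hypothetical. The sketch catches RuntimeException rather than IllegalStateException specifically, on the assumption that the exact unchecked type may differ between library versions, and inspects the cause.

```java
import java.io.IOException;
import java.io.Reader;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

public class IteratorFailureSketch {

    // Hypothetical reader that fails on every read, simulating an I/O error.
    static Reader failingReader() {
        return new Reader() {
            @Override
            public int read(char[] buffer, int offset, int length) throws IOException {
                throw new IOException("simulated read failure");
            }

            @Override
            public void close() {
                // nothing to release
            }
        };
    }

    // Returns the cause of the unchecked exception raised while iterating,
    // or null if iteration unexpectedly succeeds.
    static Throwable causeOfIterationFailure() throws IOException {
        try (CSVParser parser = new CSVParser(failingReader(), CSVFormat.DEFAULT)) {
            for (CSVRecord record : parser) {
                // not expected to be reached with the failing reader
            }
            return null;
        } catch (RuntimeException e) {
            return e.getCause();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(causeOfIterationFailure());
    }
}
```

Because the for-each loop cannot declare checked exceptions, callers that need the original IOException have to unwrap it from the cause in this way.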