class CSV
This class provides a complete interface to CSV files and data. It offers tools to enable you to read and write to and from Strings or IO objects, as needed.
The most generic interface of the library is:
  csv = CSV.new(io, **options)

  # Reading: IO object should be open for read
  csv.read # => array of rows
  # or
  csv.each do |row|
    # ...
  end
  # or
  row = csv.shift

  # Writing: IO object should be open for write
  csv << row
There are several specialized class methods for one-statement reading or writing, described in the Specialized Methods section.
If a String is passed into ::new, it is internally wrapped into a StringIO object.
options can be used to specify the particular CSV flavor (column separators, row separators, value quoting, and so on) and to configure data conversion; see the CSV Converters section for a description of the latter.
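As a minimal sketch of the generic interface, the following reads rows from a String and writes rows to a StringIO. The sample data and the variable names (reader, writer, out) are illustrative assumptions, not part of the library:

```ruby
require "csv"
require "stringio"

# Reading: a String passed to CSV.new is wrapped in a StringIO internally.
reader = CSV.new("a,b\n1,2\n3,4\n")
first  = reader.shift # the first row
rest   = reader.read  # all rows that have not been consumed yet

# Writing: rows are appended to an IO-like object that is open for writing.
out    = StringIO.new
writer = CSV.new(out, col_sep: ";")
writer << ["x", "y"]
writer << [1, 2]
written = out.string

p first   # ["a", "b"]
p rest    # [["1", "2"], ["3", "4"]]
p written # "x;y\n1;2\n"
```

Note that shift and read share one position in the data, so read returns only the rows that shift has not already returned.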
Specialized Methods
Reading
  # From a file: all at once
  arr_of_rows = CSV.read("path/to/file.csv", **options)
  # iterator-style:
  CSV.foreach("path/to/file.csv", **options) do |row|
    # ...
  end

  # From a string
  arr_of_rows = CSV.parse("CSV,data,String", **options)
  # or
  CSV.parse("CSV,data,String", **options) do |row|
    # ...
  end
Writing
  # To a file
  CSV.open("path/to/file.csv", "wb") do |csv|
    csv << ["row", "of", "CSV", "data"]
    csv << ["another", "row"]
    # ...
  end

  # To a String
  csv_string = CSV.generate do |csv|
    csv << ["row", "of", "CSV", "data"]
    csv << ["another", "row"]
    # ...
  end
Shortcuts
  # Core extensions for converting one line
  csv_string = ["CSV", "data"].to_csv   # to CSV
  csv_array  = "CSV,String".parse_csv   # from CSV

  # CSV() method
  CSV             { |csv_out| csv_out << %w{my data here} }  # to $stdout
  CSV(csv = "")   { |csv_str| csv_str << %w{my data here} }  # to a String
  CSV($stderr)    { |csv_err| csv_err << %w{my data here} }  # to $stderr
  CSV($stdin)     { |csv_in|  csv_in.each { |row| p row } }  # from $stdin
Options
The default values for options are:
  DEFAULT_OPTIONS = {
    # For both parsing and generating.
    col_sep:            ",",
    row_sep:            :auto,
    quote_char:         '"',
    # For parsing.
    field_size_limit:   nil,
    converters:         nil,
    unconverted_fields: nil,
    headers:            false,
    return_headers:     false,
    header_converters:  nil,
    skip_blanks:        false,
    skip_lines:         nil,
    liberal_parsing:    false,
    nil_value:          nil,
    empty_value:        "",
    # For generating.
    write_headers:      nil,
    quote_empty:        true,
    force_quotes:       false,
    write_converters:   nil,
    write_nil_value:    nil,
    write_empty_value:  "",
    strip:              false,
  }
Options for Parsing
Option col_sep
Specifies the String field separator to be used for both parsing and generating. The String will be transcoded into the data’s Encoding before use.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:col_sep) # => "," (comma)
For examples in this section:
ary = ['a', 'b', 'c']
Using the default:
  str = CSV.generate_line(ary)
  str # => "a,b,c\n"
  ary = CSV.parse_line(str)
  ary # => ["a", "b", "c"]
Using : (colon):
  col_sep = ':'
  str = CSV.generate_line(ary, col_sep: col_sep)
  str # => "a:b:c\n"
  ary = CSV.parse_line(str, col_sep: col_sep)
  ary # => ["a", "b", "c"]
Using :: (two colons):
  col_sep = '::'
  str = CSV.generate_line(ary, col_sep: col_sep)
  str # => "a::b::c\n"
  ary = CSV.parse_line(str, col_sep: col_sep)
  ary # => ["a", "b", "c"]
Raises an exception if given the empty String:
  col_sep = ''
  # Raises ArgumentError (:col_sep must be 1 or more characters: "")
  CSV.parse_line("a:b:c\n", col_sep: col_sep)
Raises an exception if the given value is not String-convertible:
  col_sep = BasicObject.new
  # Raises NoMethodError (undefined method `to_s' for #<BasicObject:>)
  CSV.generate_line(ary, col_sep: col_sep)
  # Raises NoMethodError (undefined method `to_s' for #<BasicObject:>)
  CSV.parse(str, col_sep: col_sep)
Option row_sep
Specifies the row separator, a String or the Symbol :auto (see below), to be used for both parsing and generating.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:row_sep) # => :auto
When row_sep is a String, that String becomes the row separator. The String will be transcoded into the data’s Encoding before use.
Using "\n":
  str = CSV.generate do |csv|
    csv << [:foo, 0]
    csv << [:bar, 1]
    csv << [:baz, 2]
  end
  str # => "foo,0\nbar,1\nbaz,2\n"
  ary = CSV.parse(str)
  ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Using | (pipe):
  row_sep = '|'
  str = CSV.generate(row_sep: row_sep) do |csv|
    csv << [:foo, 0]
    csv << [:bar, 1]
    csv << [:baz, 2]
  end
  str # => "foo,0|bar,1|baz,2|"
  ary = CSV.parse(str, row_sep: row_sep)
  ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Using -- (two hyphens):
  row_sep = '--'
  str = CSV.generate(row_sep: row_sep) do |csv|
    csv << [:foo, 0]
    csv << [:bar, 1]
    csv << [:baz, 2]
  end
  str # => "foo,0--bar,1--baz,2--"
  ary = CSV.parse(str, row_sep: row_sep)
  ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Using '' (empty string):
  row_sep = ''
  str = CSV.generate(row_sep: row_sep) do |csv|
    csv << [:foo, 0]
    csv << [:bar, 1]
    csv << [:baz, 2]
  end
  str # => "foo,0bar,1baz,2"
  ary = CSV.parse(str, row_sep: row_sep)
  ary # => [["foo", "0bar", "1baz", "2"]]
When row_sep is the Symbol :auto (the default), invokes auto-discovery of the row separator.
Auto-discovery reads ahead in the data looking for the next \r\n, \n, or \r sequence. The sequence will be selected even if it occurs in a quoted field, assuming that you would have the same line endings there.
  row_sep = :auto
  str = CSV.generate(row_sep: row_sep) do |csv|
    csv << [:foo, 0]
    csv << [:bar, 1]
    csv << [:baz, 2]
  end
  str # => "foo,0\nbar,1\nbaz,2\n"
  ary = CSV.parse(str, row_sep: row_sep)
  ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
The default $INPUT_RECORD_SEPARATOR ($/) is used if any of the following is true:
- None of those sequences is found.
- Data is ARGF, STDIN, STDOUT, or STDERR.
- The stream is only available for output.
Obviously, discovery takes a little time. Set manually if speed is important. Also note that IO objects should be opened in binary mode on Windows if this feature will be used as the line-ending translation can cause problems with resetting the document position to where it was before the read ahead.
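To illustrate auto-discovery, here is a small sketch that parses the same two rows with different line endings. The sample strings are made up for this illustration:

```ruby
require "csv"

crlf_data = "a,b\r\nc,d\r\n" # Windows-style line endings
cr_data   = "a,b\rc,d\r"     # bare carriage returns

# With the default row_sep: :auto, the separator is discovered from the data.
crlf_rows = CSV.parse(crlf_data)
cr_rows   = CSV.parse(cr_data)

# Setting row_sep explicitly skips discovery.
explicit_rows = CSV.parse(crlf_data, row_sep: "\r\n")

p crlf_rows     # [["a", "b"], ["c", "d"]]
p cr_rows       # [["a", "b"], ["c", "d"]]
p explicit_rows # [["a", "b"], ["c", "d"]]
```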
Raises an exception if the given value is not String-convertible:
  row_sep = BasicObject.new
  # Raises NoMethodError (undefined method `to_s' for #<BasicObject:>)
  CSV.generate_line(ary, row_sep: row_sep)
  # Raises NoMethodError (undefined method `to_s' for #<BasicObject:>)
  CSV.parse(str, row_sep: row_sep)
Option quote_char
Specifies the character (String of length 1) used to quote fields in both parsing and generating. This String will be transcoded into the data’s Encoding before use.
Default value:
  CSV::DEFAULT_OPTIONS.fetch(:quote_char) # => "\"" (double quote)
This is useful for an application that incorrectly uses ' (single-quote) to quote fields, instead of the correct " (double-quote).
Using the default:
  ary = ['a', 'b', '"c"', 'd']
  str = CSV.generate_line(ary)
  str # => "a,b,\"\"\"c\"\"\",d\n"
  ary = CSV.parse_line(str)
  ary # => ["a", "b", "\"c\"", "d"]
Using ' (single-quote):
  quote_char = "'"
  ary = ['a', 'b', '\'c\'', 'd']
  str = CSV.generate_line(ary, quote_char: quote_char)
  str # => "a,b,'''c''',d\n"
  ary = CSV.parse_line(str, quote_char: quote_char)
  ary # => ["a", "b", "'c'", "d"]
Raises an exception if the String length is greater than 1:
  # Raises ArgumentError (:quote_char has to be nil or a single character String)
  CSV.new('', quote_char: 'xx')
Option field_size_limit
Specifies the Integer field size limit.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:field_size_limit) # => nil
This is a maximum size CSV will read ahead looking for the closing quote for a field. (In truth, it reads to the first line ending beyond this size.) If a quote cannot be found within the limit CSV will raise a MalformedCSVError, assuming the data is faulty. You can use this limit to prevent what are effectively DoS attacks on the parser. However, this limit can cause a legitimate parse to fail; therefore the default value is nil (no limit).
For the examples in this section:
  str = <<~EOT
    "a","b"
    "
    2345
    ",""
  EOT
  str # => "\"a\",\"b\"\n\"\n2345\n\",\"\"\n"
Using the default nil:
  ary = CSV.parse(str)
  ary # => [["a", "b"], ["\n2345\n", ""]]
Using 50:
  field_size_limit = 50
  ary = CSV.parse(str, field_size_limit: field_size_limit)
  ary # => [["a", "b"], ["\n2345\n", ""]]
Raises an exception if a field is too long:
  big_str = "123456789\n" * 1024
  # Raises CSV::MalformedCSVError (Field size exceeded in line 1.)
  CSV.parse('valid,fields,"' + big_str + '"', field_size_limit: 2048)
Option converters
Specifies a single field converter name or Proc, or an Array of field converter names and Procs.
See Field Converters
Default value:
CSV::DEFAULT_OPTIONS.fetch(:converters) # => nil
The value may be a single field converter name:
  str = '1,2,3'
  # Without a converter
  ary = CSV.parse_line(str)
  ary # => ["1", "2", "3"]
  # With built-in converter :integer
  ary = CSV.parse_line(str, converters: :integer)
  ary # => [1, 2, 3]
The value may be an Array of field converter names:
  str = '1,3.14159'
  # Without converters
  ary = CSV.parse_line(str)
  ary # => ["1", "3.14159"]
  # With built-in converters
  ary = CSV.parse_line(str, converters: [:integer, :float])
  ary # => [1, 3.14159]
The value may be a Proc custom converter:
  str = ' foo , bar , baz '
  # Without a converter
  ary = CSV.parse_line(str)
  ary # => [" foo ", " bar ", " baz "]
  # With a custom converter
  ary = CSV.parse_line(str, converters: proc {|field| field.strip })
  ary # => ["foo", "bar", "baz"]
See also Custom Converters
Raises an exception if the converter is not a converter name or a Proc:
  str = 'foo,0'
  # Raises NoMethodError (undefined method `arity' for nil:NilClass)
  CSV.parse(str, converters: :foo)
Option unconverted_fields
Specifies the boolean that determines whether unconverted field values are to be available.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:unconverted_fields) # => nil
The unconverted field values are those found in the source data, prior to any conversions performed via option converters.
When option unconverted_fields is true, each returned row (Array or CSV::Row) has an added method, unconverted_fields, that returns the unconverted field values:
  str = <<-EOT
  foo,0
  bar,1
  baz,2
  EOT
  # Without unconverted_fields
  csv = CSV.parse(str, converters: :integer)
  csv # => [["foo", 0], ["bar", 1], ["baz", 2]]
  csv.first.respond_to?(:unconverted_fields) # => false
  # With unconverted_fields
  csv = CSV.parse(str, converters: :integer, unconverted_fields: true)
  csv # => [["foo", 0], ["bar", 1], ["baz", 2]]
  csv.first.respond_to?(:unconverted_fields) # => true
  csv.first.unconverted_fields # => ["foo", "0"]
Option headers
Specifies a boolean, Symbol, Array, or String to be used to define column headers.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:headers) # => false
Without headers:
  str = <<-EOT
  Name,Count
  foo,0
  bar,1
  bax,2
  EOT
  csv = CSV.new(str)
  csv # => #<CSV io_type:StringIO encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\"">
  csv.headers # => nil
  csv.shift # => ["Name", "Count"]
If set to true or the Symbol :first_row, the first row of the data is treated as a row of headers:
  str = <<-EOT
  Name,Count
  foo,0
  bar,1
  bax,2
  EOT
  csv = CSV.new(str, headers: true)
  csv # => #<CSV io_type:StringIO encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\"" headers:true>
  csv.headers # => true (the header row has not been read yet)
  csv.shift # => #<CSV::Row "Name":"foo" "Count":"0">
  csv.headers # => ["Name", "Count"]
If set to an Array, the Array elements are treated as headers:
  str = <<-EOT
  foo,0
  bar,1
  bax,2
  EOT
  csv = CSV.new(str, headers: ['Name', 'Count'])
  csv
  csv.headers # => ["Name", "Count"]
  csv.shift # => #<CSV::Row "Name":"foo" "Count":"0">
If set to a String str, method CSV::parse_line(str, options) is called with the current options, and the returned Array is treated as headers:
  str = <<-EOT
  foo,0
  bar,1
  bax,2
  EOT
  csv = CSV.new(str, headers: 'Name,Count')
  csv
  csv.headers # => ["Name", "Count"]
  csv.shift # => #<CSV::Row "Name":"foo" "Count":"0">
Option return_headers
Specifies the boolean that determines whether method shift returns or ignores the header row.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:return_headers) # => false
Examples:
  str = <<-EOT
  Name,Count
  foo,0
  bar,1
  bax,2
  EOT
  # Without return_headers, the first row returned is the first data row.
  csv = CSV.new(str, headers: true)
  csv.shift # => #<CSV::Row "Name":"foo" "Count":"0">
  # With return_headers, the first row returned is the header row.
  csv = CSV.new(str, headers: true, return_headers: true)
  csv.shift # => #<CSV::Row "Name":"Name" "Count":"Count">
Option header_converters
Specifies a String converter name or an Array of converter names.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:header_converters) # => nil
Identical in functionality to option converters except that:
- The converters apply only to the header row.
- The built-in header converters are :downcase and :symbol.
Examples:
  str = <<-EOT
  foo,0
  bar,1
  baz,2
  EOT
  headers = ['Name', 'Value']
  # With no header converter
  csv = CSV.parse(str, headers: headers)
  csv.headers # => ["Name", "Value"]
  # With header converter :downcase
  csv = CSV.parse(str, headers: headers, header_converters: :downcase)
  csv.headers # => ["name", "value"]
  # With header converter :symbol
  csv = CSV.parse(str, headers: headers, header_converters: :symbol)
  csv.headers # => [:name, :value]
  # With both
  csv = CSV.parse(str, headers: headers, header_converters: [:downcase, :symbol])
  csv.headers # => [:name, :value]
Option skip_blanks
Specifies a boolean that determines whether blank lines in the input will be ignored; a line that contains a column separator is not considered to be blank.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:skip_blanks) # => false
See also option skip_lines.
For examples in this section:
  str = <<-EOT
  foo,0

  bar,1
  baz,2

  ,
  EOT
Using the default, false:
  ary = CSV.parse(str)
  ary # => [["foo", "0"], [], ["bar", "1"], ["baz", "2"], [], [nil, nil]]
Using true:
  ary = CSV.parse(str, skip_blanks: true)
  ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"], [nil, nil]]
Using a truthy value:
  ary = CSV.parse(str, skip_blanks: :foo)
  ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"], [nil, nil]]
Option skip_lines
Specifies an object to use in identifying comment lines in the input that are to be ignored:
- If a Regexp, ignores lines that match it.
- If a String, converts it to a Regexp, ignores lines that match it.
- If nil, no lines are considered to be comments.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:skip_lines) # => nil
For examples in this section:
  str = <<-EOT
  # Comment
  foo,0
  bar,1
  baz,2
  # Another comment
  EOT
  str # => "# Comment\nfoo,0\nbar,1\nbaz,2\n# Another comment\n"
Using the default, nil:
  ary = CSV.parse(str)
  ary # => [["# Comment"], ["foo", "0"], ["bar", "1"], ["baz", "2"], ["# Another comment"]]
Using a Regexp:
  ary = CSV.parse(str, skip_lines: /^#/)
  ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Using a String:
  ary = CSV.parse(str, skip_lines: '#')
  ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Raises an exception if given an object that is not a Regexp, a String, or nil:
  # Raises ArgumentError (:skip_lines has to respond to #match: 0)
  CSV.parse(str, skip_lines: 0)
Option liberal_parsing
Specifies the boolean value that determines whether CSV will attempt to parse input not conformant with RFC 4180, such as double quotes in unquoted fields.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:liberal_parsing) # => false
For examples in this section:
str = 'is,this "three, or four",fields'
Without liberal_parsing:
  # Raises CSV::MalformedCSVError (Illegal quoting in line 1.)
  CSV.parse_line(str)
With liberal_parsing:
  ary = CSV.parse_line(str, liberal_parsing: true)
  ary # => ["is", "this \"three", " or four\"", "fields"]
Option nil_value
Specifies the object that is to be substituted for each null (no-text) field.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:nil_value) # => nil
With the default, nil:
CSV.parse_line('a,,b,,c') # => ["a", nil, "b", nil, "c"]
With a different object:
CSV.parse_line('a,,b,,c', nil_value: 0) # => ["a", 0, "b", 0, "c"]
Option empty_value
Specifies the object that is to be substituted for each field that has an empty String.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:empty_value) # => "" (empty string)
With the default, "":
CSV.parse_line('a,"",b,"",c') # => ["a", "", "b", "", "c"]
With a different object:
CSV.parse_line('a,"",b,"",c', empty_value: 'x') # => ["a", "x", "b", "x", "c"]
Options for Generating
Option col_sep
Specifies the String field separator to be used for both parsing and generating. The String will be transcoded into the data’s Encoding before use.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:col_sep) # => "," (comma)
For examples in this section:
ary = ['a', 'b', 'c']
Using the default:
  str = CSV.generate_line(ary)
  str # => "a,b,c\n"
  ary = CSV.parse_line(str)
  ary # => ["a", "b", "c"]
Using : (colon):
  col_sep = ':'
  str = CSV.generate_line(ary, col_sep: col_sep)
  str # => "a:b:c\n"
  ary = CSV.parse_line(str, col_sep: col_sep)
  ary # => ["a", "b", "c"]
Using :: (two colons):
  col_sep = '::'
  str = CSV.generate_line(ary, col_sep: col_sep)
  str # => "a::b::c\n"
  ary = CSV.parse_line(str, col_sep: col_sep)
  ary # => ["a", "b", "c"]
Raises an exception if given the empty String:
  col_sep = ''
  # Raises ArgumentError (:col_sep must be 1 or more characters: "")
  CSV.parse_line("a:b:c\n", col_sep: col_sep)
Raises an exception if the given value is not String-convertible:
  col_sep = BasicObject.new
  # Raises NoMethodError (undefined method `to_s' for #<BasicObject:>)
  CSV.generate_line(ary, col_sep: col_sep)
  # Raises NoMethodError (undefined method `to_s' for #<BasicObject:>)
  CSV.parse(str, col_sep: col_sep)
Option row_sep
Specifies the row separator, a String or the Symbol :auto (see below), to be used for both parsing and generating.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:row_sep) # => :auto
When row_sep is a String, that String becomes the row separator. The String will be transcoded into the data’s Encoding before use.
Using "\n":
  str = CSV.generate do |csv|
    csv << [:foo, 0]
    csv << [:bar, 1]
    csv << [:baz, 2]
  end
  str # => "foo,0\nbar,1\nbaz,2\n"
  ary = CSV.parse(str)
  ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Using | (pipe):
  row_sep = '|'
  str = CSV.generate(row_sep: row_sep) do |csv|
    csv << [:foo, 0]
    csv << [:bar, 1]
    csv << [:baz, 2]
  end
  str # => "foo,0|bar,1|baz,2|"
  ary = CSV.parse(str, row_sep: row_sep)
  ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Using -- (two hyphens):
  row_sep = '--'
  str = CSV.generate(row_sep: row_sep) do |csv|
    csv << [:foo, 0]
    csv << [:bar, 1]
    csv << [:baz, 2]
  end
  str # => "foo,0--bar,1--baz,2--"
  ary = CSV.parse(str, row_sep: row_sep)
  ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Using '' (empty string):
  row_sep = ''
  str = CSV.generate(row_sep: row_sep) do |csv|
    csv << [:foo, 0]
    csv << [:bar, 1]
    csv << [:baz, 2]
  end
  str # => "foo,0bar,1baz,2"
  ary = CSV.parse(str, row_sep: row_sep)
  ary # => [["foo", "0bar", "1baz", "2"]]
When row_sep is the Symbol :auto (the default), invokes auto-discovery of the row separator.
Auto-discovery reads ahead in the data looking for the next \r\n, \n, or \r sequence. The sequence will be selected even if it occurs in a quoted field, assuming that you would have the same line endings there.
  row_sep = :auto
  str = CSV.generate(row_sep: row_sep) do |csv|
    csv << [:foo, 0]
    csv << [:bar, 1]
    csv << [:baz, 2]
  end
  str # => "foo,0\nbar,1\nbaz,2\n"
  ary = CSV.parse(str, row_sep: row_sep)
  ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
The default $INPUT_RECORD_SEPARATOR ($/) is used if any of the following is true:
- None of those sequences is found.
- Data is ARGF, STDIN, STDOUT, or STDERR.
- The stream is only available for output.
Obviously, discovery takes a little time. Set manually if speed is important. Also note that IO objects should be opened in binary mode on Windows if this feature will be used as the line-ending translation can cause problems with resetting the document position to where it was before the read ahead.
Raises an exception if the given value is not String-convertible:
  row_sep = BasicObject.new
  # Raises NoMethodError (undefined method `to_s' for #<BasicObject:>)
  CSV.generate_line(ary, row_sep: row_sep)
  # Raises NoMethodError (undefined method `to_s' for #<BasicObject:>)
  CSV.parse(str, row_sep: row_sep)
Option quote_char
Specifies the character (String of length 1) used to quote fields in both parsing and generating. This String will be transcoded into the data’s Encoding before use.
Default value:
  CSV::DEFAULT_OPTIONS.fetch(:quote_char) # => "\"" (double quote)
This is useful for an application that incorrectly uses ' (single-quote) to quote fields, instead of the correct " (double-quote).
Using the default:
  ary = ['a', 'b', '"c"', 'd']
  str = CSV.generate_line(ary)
  str # => "a,b,\"\"\"c\"\"\",d\n"
  ary = CSV.parse_line(str)
  ary # => ["a", "b", "\"c\"", "d"]
Using ' (single-quote):
  quote_char = "'"
  ary = ['a', 'b', '\'c\'', 'd']
  str = CSV.generate_line(ary, quote_char: quote_char)
  str # => "a,b,'''c''',d\n"
  ary = CSV.parse_line(str, quote_char: quote_char)
  ary # => ["a", "b", "'c'", "d"]
Raises an exception if the String length is greater than 1:
  # Raises ArgumentError (:quote_char has to be nil or a single character String)
  CSV.new('', quote_char: 'xx')
Option write_headers
Specifies the boolean that determines whether a header row is included in the output; ignored if there are no headers.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:write_headers) # => nil
Without write_headers:
  file_path = 't.csv'
  CSV.open(file_path, 'w', :headers => ['Name', 'Value']) do |csv|
    csv << ['foo', '0']
  end
  CSV.open(file_path) do |csv|
    csv.shift
  end # => ["foo", "0"]
With write_headers:
  CSV.open(file_path, 'w', :write_headers => true, :headers => ['Name', 'Value']) do |csv|
    csv << ['foo', '0']
  end
  CSV.open(file_path) do |csv|
    csv.shift
  end # => ["Name", "Value"]
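The same behavior can be observed without touching the file system. The following sketch uses CSV.generate instead of CSV.open; the header names and row values are sample data chosen for this illustration:

```ruby
require "csv"

headers = ["Name", "Value"]

# Headers are known to the writer but not written to the output.
without_headers = CSV.generate(headers: headers) do |csv|
  csv << ["foo", "0"]
end

# With write_headers: true, the header row is written before the first row.
with_headers = CSV.generate(headers: headers, write_headers: true) do |csv|
  csv << ["foo", "0"]
end

p without_headers # "foo,0\n"
p with_headers    # "Name,Value\nfoo,0\n"
```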
Option force_quotes
Specifies the boolean that determines whether each output field is to be double-quoted.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:force_quotes) # => false
For examples in this section:
ary = ['foo', 0, nil]
Using the default, false:
  str = CSV.generate_line(ary)
  str # => "foo,0,\n"
Using true:
  str = CSV.generate_line(ary, force_quotes: true)
  str # => "\"foo\",\"0\",\"\"\n"
Option quote_empty
Specifies the boolean that determines whether an empty value is to be double-quoted.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:quote_empty) # => true
With the default true:
CSV.generate_line(['"', ""]) # => "\"\"\"\",\"\"\n"
With false:
CSV.generate_line(['"', ""], quote_empty: false) # => "\"\"\"\",\n"
Option write_converters
Specifies the Proc or Array of Procs that are to be called for converting each output field.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:write_converters) # => nil
With no write converter:
  str = CSV.generate_line(["\na\n", "\tb\t", " c "])
  str # => "\"\na\n\",\tb\t, c \n"
With a write converter:
  strip_converter = lambda {|field| field.strip }
  str = CSV.generate_line(["\na\n", "\tb\t", " c "], write_converters: strip_converter)
  str # => "a,b,c\n"
With two write converters (called in order):
  upcase_converter = lambda {|field| field.upcase }
  downcase_converter = lambda {|field| field.downcase }
  write_converters = [upcase_converter, downcase_converter]
  str = CSV.generate_line(['a', 'b', 'c'], write_converters: write_converters)
  str # => "a,b,c\n"
Raises an exception if the converter returns a value that is neither nil nor String-convertible:
  bad_converter = lambda {|field| BasicObject.new }
  # Raises NoMethodError (undefined method `is_a?' for #<BasicObject:>)
  CSV.generate_line(['a', 'b', 'c'], write_converters: bad_converter)
Option write_nil_value
Specifies the object that is to be substituted for each nil field.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:write_nil_value) # => nil
Without the option:
  str = CSV.generate_line(['a', nil, 'c', nil])
  str # => "a,,c,\n"
With the option:
  str = CSV.generate_line(['a', nil, 'c', nil], write_nil_value: "x")
  str # => "a,x,c,x\n"
Option write_empty_value
Specifies the object that is to be substituted for each field that has an empty String.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:write_empty_value) # => ""
Without the option:
  str = CSV.generate_line(['a', '', 'c', ''])
  str # => "a,\"\",c,\"\"\n"
With the option:
  str = CSV.generate_line(['a', '', 'c', ''], write_empty_value: "x")
  str # => "a,x,c,x\n"
Option strip
Specifies the boolean value that determines whether whitespace is stripped from each input field.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:strip) # => false
With default value false:
  ary = CSV.parse_line(' a , b ')
  ary # => [" a ", " b "]
With value true:
  ary = CSV.parse_line(' a , b ', strip: true)
  ary # => ["a", "b"]
CSV with headers
CSV allows you to specify the column names of a CSV file, whether they are in the data or provided separately. If headers are specified, reading methods return an instance of CSV::Table, consisting of CSV::Row objects.
  # Headers are part of data
  data = CSV.parse(<<~ROWS, headers: true)
    Name,Department,Salary
    Bob,Engineering,1000
    Jane,Sales,2000
    John,Management,5000
  ROWS

  data.class      #=> CSV::Table
  data.first      #=> #<CSV::Row "Name":"Bob" "Department":"Engineering" "Salary":"1000">
  data.first.to_h #=> {"Name"=>"Bob", "Department"=>"Engineering", "Salary"=>"1000"}

  # Headers provided by developer
  data = CSV.parse('Bob,Engineering,1000', headers: %i[name department salary])
  data.first      #=> #<CSV::Row name:"Bob" department:"Engineering" salary:"1000">
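Once data is parsed with headers, columns and fields can be addressed by header name. The sketch below assumes the default access mode of CSV::Table and uses made-up sample rows; the :integer converter is added only so that the salaries can be summed:

```ruby
require "csv"

table = CSV.parse(<<~ROWS, headers: true, converters: :integer)
  Name,Department,Salary
  Bob,Engineering,1000
  Jane,Sales,2000
ROWS

# Indexing a table with a header name returns that column's values.
names = table["Name"]
total = table["Salary"].sum

# Each row can be indexed by header name as well.
sales_row  = table.find { |row| row["Department"] == "Sales" }
sales_name = sales_row["Name"]

p names      # ["Bob", "Jane"]
p total      # 3000
p sales_name # "Jane"
```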
CSV Converters
By default, each field parsed by CSV is formed into a String. You can use a converter to convert certain fields into other Ruby objects.
When you specify a converter for parsing, each parsed field is passed to the converter; its return value becomes the new value for the field. A converter might, for example, convert an integer embedded in a String into a true Integer. (In fact, that’s what built-in field converter :integer does.)
There are additional built-in converters, and custom converters are also supported.
All converters try to transcode fields to UTF-8 before converting. The conversion will fail if the data cannot be transcoded, leaving the field unchanged.
Field Converters
There are three ways to use field converters; these examples use built-in field converter :integer, which converts each parsed integer string to a true Integer.
Option converters with a singleton parsing method:
  ary = CSV.parse_line('0,1,2', converters: :integer)
  ary # => [0, 1, 2]
Option converters with a new CSV instance:
  csv = CSV.new('0,1,2', converters: :integer)
  # Field converters in effect:
  csv.converters # => [:integer]
  csv.shift # => [0, 1, 2]
Method convert adds a field converter to a CSV instance:
  csv = CSV.new('0,1,2')
  # Add a converter.
  csv.convert(:integer)
  csv.converters # => [:integer]
  csv.shift # => [0, 1, 2]
The built-in field converters are in Hash CSV::Converters. The Symbol keys there are the names of the converters:
CSV::Converters.keys # => [:integer, :float, :numeric, :date, :date_time, :all]
Converter :integer converts each field that Integer() accepts:
  data = '0,1,2,x'
  # Without the converter
  csv = CSV.parse_line(data)
  csv # => ["0", "1", "2", "x"]
  # With the converter
  csv = CSV.parse_line(data, converters: :integer)
  csv # => [0, 1, 2, "x"]
Converter :float converts each field that Float() accepts:
  data = '1.0,3.14159,x'
  # Without the converter
  csv = CSV.parse_line(data)
  csv # => ["1.0", "3.14159", "x"]
  # With the converter
  csv = CSV.parse_line(data, converters: :float)
  csv # => [1.0, 3.14159, "x"]
Converter :numeric converts with both :integer and :float.
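A brief sketch of :numeric on mixed input, using a made-up sample line:

```ruby
require "csv"

# Integer-like fields become Integers, float-like fields become Floats,
# and anything else is left as a String.
row = CSV.parse_line("1,2.5,x", converters: :numeric)

p row # [1, 2.5, "x"]
```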
Converter :date converts each field that Date::parse() accepts:
  data = '2001-02-03,x'
  # Without the converter
  csv = CSV.parse_line(data)
  csv # => ["2001-02-03", "x"]
  # With the converter
  csv = CSV.parse_line(data, converters: :date)
  csv # => [#<Date: 2001-02-03 ((2451944j,0s,0n),+0s,2299161j)>, "x"]
Converter :date_time converts each field that DateTime::parse() accepts:
  data = '2020-05-07T14:59:00-05:00,x'
  # Without the converter
  csv = CSV.parse_line(data)
  csv # => ["2020-05-07T14:59:00-05:00", "x"]
  # With the converter
  csv = CSV.parse_line(data, converters: :date_time)
  csv # => [#<DateTime: 2020-05-07T14:59:00-05:00 ((2458977j,71940s,0n),-18000s,2299161j)>, "x"]
Converter :all converts with both :date_time and :numeric.
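As a sketch of the :all entry listed in CSV::Converters.keys, which combines :date_time and :numeric, the following parses a made-up line containing a timestamp, an integer, a float, and plain text:

```ruby
require "csv"
require "date"

row = CSV.parse_line("2020-05-07T14:59:00-05:00,42,3.5,x", converters: :all)

p row[0].class # DateTime
p row[1]       # 42
p row[2]       # 3.5
p row[3]       # "x"
```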
As seen above, method convert adds converters to a CSV instance, and method converters returns an Array of the converters in effect:
  csv = CSV.new('0,1,2')
  csv.converters # => []
  csv.convert(:integer)
  csv.converters # => [:integer]
  csv.convert(:date)
  csv.converters # => [:integer, :date]
You can add a custom field converter to Hash CSV::Converters:
  strip_converter = proc {|field| field.strip}
  CSV::Converters[:strip] = strip_converter
  CSV::Converters.keys # => [:integer, :float, :numeric, :date, :date_time, :all, :strip]
Then use it to convert fields:
  str = ' foo , 0 '
  ary = CSV.parse_line(str, converters: :strip)
  ary # => ["foo", "0"]
See Custom Converters.
Header Converters
Header converters operate only on headers (and not on other rows).
There are three ways to use header converters; these examples use built-in header converter :downcase, which downcases each parsed header.
Option header_converters with a singleton parsing method:
  str = "Name,Count\nFoo,0\nBar,1\nBaz,2"
  tbl = CSV.parse(str, headers: true, header_converters: :downcase)
  tbl.class # => CSV::Table
  tbl.headers # => ["name", "count"]
Option header_converters with a new CSV instance:
  csv = CSV.new(str, header_converters: :downcase)
  # Header converters in effect:
  csv.header_converters # => [:downcase]
  tbl = CSV.parse(str, headers: true)
  tbl.headers # => ["Name", "Count"]
Method header_convert adds a header converter to a CSV instance:
  csv = CSV.new(str)
  # Add a header converter.
  csv.header_convert(:downcase)
  csv.header_converters # => [:downcase]
  tbl = CSV.parse(str, headers: true)
  tbl.headers # => ["Name", "Count"]
The built-in header converters are in Hash CSV::HeaderConverters. The Symbol keys there are the names of the converters:
  CSV::HeaderConverters.keys # => [:downcase, :symbol]
Converter :downcase converts each header by downcasing it:
  str = "Name,Count\nFoo,0\nBar,1\nBaz,2"
  tbl = CSV.parse(str, headers: true, header_converters: :downcase)
  tbl.class # => CSV::Table
  tbl.headers # => ["name", "count"]
Converter :symbol converts each header by making it into a Symbol:
  str = "Name,Count\nFoo,0\nBar,1\nBaz,2"
  tbl = CSV.parse(str, headers: true, header_converters: :symbol)
  tbl.headers # => [:name, :count]
Details:
- Strips leading and trailing whitespace.
- Downcases the header.
- Replaces embedded spaces with underscores.
- Removes non-word characters.
- Makes the string into a Symbol.
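The steps listed above can be seen on headers that contain surrounding whitespace, an embedded space, and a non-word character. The header text and data row here are invented for this illustration:

```ruby
require "csv"

data  = " Full Name ,E-mail Address\nAda,ada@example.com\n"
table = CSV.parse(data, headers: true, header_converters: :symbol)

headers = table.headers
name    = table[0][:full_name] # fields are then addressed by Symbol

p headers # [:full_name, :email_address]
p name    # "Ada"
```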
You can add a custom header converter to Hash CSV::HeaderConverters:
  strip_converter = proc {|field| field.strip}
  CSV::HeaderConverters[:strip] = strip_converter
  CSV::HeaderConverters.keys # => [:downcase, :symbol, :strip]
Then use it to convert headers:
  str = " Name , Value \nfoo,0\nbar,1\nbaz,2"
  tbl = CSV.parse(str, headers: true, header_converters: :strip)
  tbl.headers # => ["Name", "Value"]
See Custom Converters.
Custom Converters
You can define custom converters.
The converter is a Proc that is called with two arguments, String field and CSV::FieldInfo field_info; it returns a String that will become the field value:
converter = proc {|field, field_info| <some_string> }
To illustrate:
  converter = proc {|field, field_info| p [field, field_info]; field}
  ary = CSV.parse_line('foo,0', converters: converter)
Produces:
  ["foo", #<struct CSV::FieldInfo index=0, line=1, header=nil>]
  ["0", #<struct CSV::FieldInfo index=1, line=1, header=nil>]
In each of the output lines:
- The first Array element is the passed String field.
- The second is a FieldInfo structure containing information about the field:
  - The 0-based column index.
  - The 1-based line number.
  - The header for the column, if available.
If the converter does not need field_info, it can be omitted:
converter = proc {|field| ... }
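One use for field_info is to convert only selected columns. The following sketch converts fields whose header is "Count" and leaves all others alone; the converter name count_converter and the sample data are hypothetical:

```ruby
require "csv"

# Hypothetical converter: only fields under the "Count" header become Integers.
count_converter = proc do |field, field_info|
  field_info.header == "Count" ? Integer(field) : field
end

table = CSV.parse("Name,Count\nfoo,0\nbar,1\n",
                  headers: true, converters: count_converter)

counts = table["Count"]
names  = table["Name"]

p counts # [0, 1]
p names  # ["foo", "bar"]
```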
CSV and Character Encodings (M17n or Multilingualization)
This new CSV parser is m17n savvy. The parser works in the Encoding of the IO or String object being read from or written to. Your data is never transcoded (unless you ask Ruby to transcode it for you) and will literally be parsed in the Encoding it is in. Thus CSV will return Arrays or Rows of Strings in the Encoding of your data. This is accomplished by transcoding the parser itself into your Encoding.
Some transcoding must take place, of course, to accomplish this multiencoding support. For example, :col_sep, :row_sep, and :quote_char must be transcoded to match your data. Hopefully this makes the entire process feel transparent, since CSV’s defaults should just magically work for your data. However, you can set these values manually in the target Encoding to avoid the translation.
It’s also important to note that while all of CSV’s core parser is now Encoding agnostic, some features are not. For example, the built-in converters will try to transcode data to UTF-8 before making conversions. Again, you can provide custom converters that are aware of your Encodings to avoid this translation. It’s just too hard for me to support native conversions in all of Ruby’s Encodings.
Anyway, the practical side of this is simple: make sure IO and String objects passed into CSV have the proper Encoding set and everything should just work. CSV methods that allow you to open IO objects (CSV::foreach(), CSV::open(), CSV::read(), and CSV::readlines()) do allow you to specify the Encoding.
One minor exception comes when generating CSV into a String with an Encoding that is not ASCII compatible. There’s no existing data for CSV to use to prepare itself and thus you will probably need to manually specify the desired Encoding for most of those cases. It will try to guess using the fields in a row of output though, when using CSV::generate_line() or Array#to_csv().
I try to point out any other Encoding issues in the documentation of methods as they come up.
This has been tested to the best of my ability with all non-“dummy” Encodings Ruby ships with. However, it is brave new code and may have some bugs. Please feel free to report any issues you find with it.
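To illustrate the point that data is parsed in the Encoding it arrives in, here is a small sketch that parses ISO-8859-1 data. The sample string is an assumption chosen for illustration:

```ruby
require "csv"

# Build a String whose Encoding is ISO-8859-1 (sample data, for illustration).
latin1 = "caf\u00E9,1\n".encode("ISO-8859-1")

row = CSV.parse_line(latin1)

# The fields come back in the Encoding of the data; nothing was transcoded.
field_encoding = row[0].encoding

# Transcoding is left to the caller, if wanted.
as_utf8 = row[0].encode("UTF-8")
```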
Constants
- ConverterEncoding
The encoding used by all converters.
- Converters
This Hash holds the built-in converters of CSV that can be accessed by name. You can select Converters with CSV.convert() or through the options Hash passed to CSV::new().
  :integer - Converts any field Integer() accepts.
  :float - Converts any field Float() accepts.
  :numeric - A combination of :integer and :float.
  :date - Converts any field Date::parse() accepts.
  :date_time - Converts any field DateTime::parse() accepts.
  :all - All built-in converters. A combination of :date_time and :numeric.
All built-in converters transcode field data to UTF-8 before attempting a conversion. If your data cannot be transcoded to UTF-8 the conversion will fail and the field will remain unchanged.
This Hash is intentionally left unfrozen and users should feel free to add values to it that can be accessed by all CSV objects.
To add a combo field, the value should be an Array of names. Combo fields can be nested with other combo fields.
- DEFAULT_OPTIONS
Default values for method options.
- DateMatcher
A Regexp used to find and convert some common Date formats.
- DateTimeMatcher
A Regexp used to find and convert some common DateTime formats.
- FieldInfo
A FieldInfo Struct contains details about a field’s position in the data source it was read from. CSV will pass this Struct to some blocks that make decisions based on field structure. See CSV.convert_fields() for an example.
  index - The zero-based index of the field in its row.
  line - The line of the data source this row is from.
  header - The header for the column, when available.
- HeaderConverters
This Hash holds the built-in header converters of CSV that can be accessed by name. You can select HeaderConverters with CSV.header_convert() or through the options Hash passed to CSV::new().
  :downcase - Calls downcase() on the header String.
  :symbol - Leading/trailing spaces are dropped, string is downcased, remaining spaces are replaced with underscores, non-word characters are dropped, and finally to_sym() is called.
All built-in header converters transcode header data to UTF-8 before attempting a conversion. If your data cannot be transcoded to UTF-8 the conversion will fail and the header will remain unchanged.
This Hash is intentionally left unfrozen and users should feel free to add values to it that can be accessed by all CSV objects.
To add a combo field, the value should be an Array of names. Combo fields can be nested with other combo fields.
- VERSION
The version of the installed library.
Attributes
The Encoding CSV is parsing or writing in. This will be the Encoding you receive parsed data in and/or the Encoding data will be written in.
Public Class Methods
This method is a convenience for building Unix-like filters for CSV data. Each row is yielded to the provided block which can alter it as needed. After the block returns, the row is appended to output altered or not.
The input and output arguments can be anything CSV::new() accepts (generally String or IO objects). If not given, they default to ARGF and $stdout.
The options parameter is also filtered down to CSV::new() after some clever key parsing. Any key beginning with :in_ or :input_ will have that leading identifier stripped and will only be used in the options Hash for the input object. Keys starting with :out_ or :output_ affect only output. All other keys are assigned to both objects.
See Options for Parsing and Options for Generating.
The :output_row_sep option defaults to $INPUT_RECORD_SEPARATOR ($/).
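The :in_ and :out_ key prefixes can be sketched as follows. This is a minimal example using Strings for both input and output; the sample data is an assumption:

```ruby
require "csv"

input  = "a;b\nc;d\n"  # semicolon-separated sample input
output = +""           # unfrozen String that receives the filtered CSV

# :in_col_sep applies only to the reader, :out_col_sep only to the writer.
CSV.filter(input, output, in_col_sep: ";", out_col_sep: ",") do |row|
  row.map!(&:upcase)   # alter the row in place; it is written after the block
end
```

Since no :out_row_sep is given, the output rows end with $INPUT_RECORD_SEPARATOR.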
# File lib/csv.rb, line 729
def filter(input=nil, output=nil, **options)
  # parse options for input, output, or both
  in_options, out_options = Hash.new, {row_sep: $INPUT_RECORD_SEPARATOR}
  options.each do |key, value|
    case key.to_s
    when /\Ain(?:put)?_(.+)\Z/
      in_options[$1.to_sym] = value
    when /\Aout(?:put)?_(.+)\Z/
      out_options[$1.to_sym] = value
    else
      in_options[key] = value
      out_options[key] = value
    end
  end
  # build input and output wrappers
  input = new(input || ARGF, **in_options)
  output = new(output || $stdout, **out_options)

  # read, yield, write
  input.each do |row|
    yield row
    output << row
  end
end
This method is intended as the primary interface for reading CSV files. You pass a path and any options you wish to set for the read. Each row of the file will be passed to the provided block in turn.
See Options for Parsing.
The options parameter can be anything CSV::new() understands. This method also understands an additional :encoding parameter that you can use to specify the Encoding of the data in the file to be read. You must provide this unless your data is in Encoding::default_external(). CSV will use this to determine how to parse the data. You may provide a second Encoding to have the data transcoded as it is read. For example, encoding: "UTF-32BE:UTF-8" would read UTF-32BE data from the file but transcode it to UTF-8 before CSV parses it.
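The :encoding option described above might be used like this. The sketch writes a small ISO-8859-1 file into a temporary directory first; the file name and contents are assumptions made for the example:

```ruby
require "csv"
require "tmpdir"

rows = []
Dir.mktmpdir do |dir|
  path = File.join(dir, "latin1.csv")  # hypothetical file name
  # Write ISO-8859-1 bytes so the file differs from a UTF-8 default.
  File.binwrite(path, "name,city\nZo\u00EB,K\u00F6ln\n".encode("ISO-8859-1"))

  # Read as ISO-8859-1 and transcode to UTF-8 before CSV parses the data.
  CSV.foreach(path, encoding: "ISO-8859-1:UTF-8") do |row|
    rows << row
  end
end
```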
# File lib/csv.rb, line 770
def foreach(path, mode="r", **options, &block)
  return to_enum(__method__, path, mode, **options) unless block_given?
  open(path, mode, **options) do |csv|
    csv.each(&block)
  end
end
This method wraps a String you provide, or an empty default String, in a CSV object which is passed to the provided block. You can use the block to append CSV rows to the String and when the block exits, the final String will be returned.
Note that a passed String is modified by this method. Call dup() before passing if you need a new String.
This method has one additional option: :encoding, which sets the base Encoding for the output if no str is specified. CSV needs this hint if you plan to output non-ASCII compatible data.
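A short sketch of both forms, with and without a String argument. The variable names and sample rows are illustrative:

```ruby
require "csv"

# No String given: a new String is built and returned.
fresh = CSV.generate do |csv|
  csv << ["foo", 0]
  csv << ["bar", 1]
end

# String given: rows are appended to it, and the passed String is modified.
existing = +"name,count\n"
CSV.generate(existing) do |csv|
  csv << ["baz", 2]
end
```

Pass existing.dup instead if the original String must stay unchanged.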
# File lib/csv.rb, line 796
def generate(str=nil, **options)
  encoding = options[:encoding]
  # add a default empty String, if none was given
  if str
    str = StringIO.new(str)
    str.seek(0, IO::SEEK_END)
    str.set_encoding(encoding) if encoding
  else
    str = +""
    str.force_encoding(encoding) if encoding
  end
  csv = new(str, **options) # wrap
  yield csv                 # yield for appending
  csv.string                # return final String
end
Returns the String created by generating CSV from ary using the specified options.
Argument ary must be an Array.
Special options:
- Option :row_sep defaults to $INPUT_RECORD_SEPARATOR ($/):
  $INPUT_RECORD_SEPARATOR # => "\n"
- This method accepts an additional option, :encoding, which sets the base Encoding for the output. This method will try to guess your Encoding from the first String field in ary, if possible, but you may need to use this parameter as a backup plan.
For other options, see Options for Generating.
Returns the String generated from an Array:
CSV.generate_line(['foo', '0']) # => "foo,0\n"
Raises an exception if ary is not an Array:
# Raises NoMethodError (undefined method `find' for :foo:Symbol)
CSV.generate_line(:foo)
# File lib/csv.rb, line 844
def generate_line(row, **options)
  options = {row_sep: $INPUT_RECORD_SEPARATOR}.merge(options)
  str = +""
  if options[:encoding]
    str.force_encoding(options[:encoding])
  elsif field = row.find {|f| f.is_a?(String)}
    str.force_encoding(field.encoding)
  end
  (new(str, **options) << row).string
end
This method will return a CSV instance, just like CSV::new(), but the instance will be cached and returned for all future calls to this method for the same data object (tested by Object#object_id()) with the same options.
See Options for Parsing and Options for Generating.
If a block is given, the instance is passed to the block and the return value becomes the return value of the block.
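The caching behaviour can be sketched with a StringIO as the data object. The separators and rows are arbitrary choices for the example:

```ruby
require "csv"
require "stringio"

out = StringIO.new

first  = CSV.instance(out, col_sep: ";")
second = CSV.instance(out, col_sep: ";")   # same object and options: cached
other  = CSV.instance(out, col_sep: "\t")  # different options: new instance

# With a block, the block's value is returned instead of the instance.
result = CSV.instance(out, col_sep: ";") do |csv|
  csv << ["a", "b"]
  :done
end
```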
# File lib/csv.rb, line 686
def instance(data = $stdout, **options)
  # create a _signature_ for this method call, data object and options
  sig = [data.object_id] +
        options.values_at(*DEFAULT_OPTIONS.keys.sort_by { |sym| sym.to_s })

  # fetch or create the instance for this signature
  @@instances ||= Hash.new
  instance = (@@instances[sig] ||= new(data, **options))

  if block_given?
    yield instance  # run block, if given, returning result
  else
    instance        # or return the instance
  end
end
Returns the new CSV object created using string or io and the specified options.
Argument string should be a String object; it will be put into a new StringIO object positioned at the beginning.
Argument io should be an IO object; it will be positioned at the beginning.
To position at the end, for appending, use method CSV.generate. For any other positioning, pass a preset StringIO object instead.
In addition to the CSV instance methods, several IO methods are delegated. See CSV::open for a complete list.
For options, see Options for Parsing and Options for Generating.
For performance reasons, the options cannot be overridden in a CSV object, so the options specified here will endure.
Create a CSV object from a String object:
csv = CSV.new('foo,0')
csv # => #<CSV io_type:StringIO encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\"">
Create a CSV object from a File object:
File.write('t.csv', 'foo,0')
csv = CSV.new(File.open('t.csv'))
csv # => #<CSV io_type:File io_path:"t.csv" encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\"">
Raises an exception if the argument is nil:
# Raises ArgumentError (Cannot parse nil as CSV):
CSV.new(nil)
# File lib/csv.rb, line 1102 def initialize(data, col_sep: ",", row_sep: :auto, quote_char: '"', field_size_limit: nil, converters: nil, unconverted_fields: nil, headers: false, return_headers: false, write_headers: nil, header_converters: nil, skip_blanks: false, force_quotes: false, skip_lines: nil, liberal_parsing: false, internal_encoding: nil, external_encoding: nil, encoding: nil, nil_value: nil, empty_value: "", quote_empty: true, write_converters: nil, write_nil_value: nil, write_empty_value: "", strip: false) raise ArgumentError.new("Cannot parse nil as CSV") if data.nil? if data.is_a?(String) @io = StringIO.new(data) @io.set_encoding(encoding || data.encoding) else @io = data end @encoding = determine_encoding(encoding, internal_encoding) @base_fields_converter_options = { nil_value: nil_value, empty_value: empty_value, } @write_fields_converter_options = { nil_value: write_nil_value, empty_value: write_empty_value, } @initial_converters = converters @initial_header_converters = header_converters @initial_write_converters = write_converters @parser_options = { column_separator: col_sep, row_separator: row_sep, quote_character: quote_char, field_size_limit: field_size_limit, unconverted_fields: unconverted_fields, headers: headers, return_headers: return_headers, skip_blanks: skip_blanks, skip_lines: skip_lines, liberal_parsing: liberal_parsing, encoding: @encoding, nil_value: nil_value, empty_value: empty_value, strip: strip, } @parser = nil @parser_enumerator = nil @eof_error = nil @writer_options = { encoding: @encoding, force_encoding: (not encoding.nil?), force_quotes: force_quotes, headers: headers, write_headers: write_headers, column_separator: col_sep, row_separator: row_sep, quote_character: quote_char, quote_empty: quote_empty, } @writer = nil writer if @writer_options[:write_headers] end
This method opens an IO object, and wraps that with CSV. This is intended as the primary interface for writing a CSV file.
You must pass a filename and may optionally add a mode for Ruby’s open().
This method works like Ruby’s open() call, in that it will pass a CSV object to a provided block and close it when the block terminates, or it will return the CSV object when no block is provided. (Note: This is different from the Ruby 1.8 CSV library which passed rows to the block. Use CSV::foreach() for that behavior.)
You must provide a mode with an embedded Encoding designator unless your data is in Encoding::default_external(). CSV will check the Encoding of the underlying IO object (set by the mode you pass) to determine how to parse the data. You may provide a second Encoding to have the data transcoded as it is read just as you can with a normal call to IO::open(). For example, "rb:UTF-32BE:UTF-8" would read UTF-32BE data from the file but transcode it to UTF-8 before CSV parses it.
An opened CSV object will delegate to many IO methods for convenience. You may call:
- binmode()
- binmode?()
- close()
- close_read()
- close_write()
- closed?()
- eof()
- eof?()
- external_encoding()
- fcntl()
- fileno()
- flock()
- flush()
- fsync()
- internal_encoding()
- ioctl()
- isatty()
- path()
- pid()
- pos()
- pos=()
- reopen()
- seek()
- stat()
- sync()
- sync=()
- tell()
- to_i()
- to_io()
- truncate()
- tty?()
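The block and non-block forms of CSV.open described above can be sketched as a write-then-read round trip. The temporary directory and file name are assumptions made for the example:

```ruby
require "csv"
require "tmpdir"

rows = nil
Dir.mktmpdir do |dir|
  path = File.join(dir, "data.csv")  # hypothetical file name

  # Block form: the CSV object is closed when the block terminates.
  CSV.open(path, "w") do |csv|
    csv << ["name", "count"]
    csv << ["foo", 0]
  end

  # Non-block form: the CSV object is returned and the caller closes it.
  csv = CSV.open(path, "r")
  begin
    rows = csv.read
  ensure
    csv.close
  end
end
```

Fields come back as Strings because no converters were requested.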
# File lib/csv.rb, line 919 def open(filename, mode="r", **options) # wrap a File opened with the remaining +args+ with no newline # decorator file_opts = {universal_newline: false}.merge(options) begin f = File.open(filename, mode, **file_opts) rescue ArgumentError => e raise unless /needs binmode/.match?(e.message) and mode == "r" mode = "rb" file_opts = {encoding: Encoding.default_external}.merge(file_opts) retry end begin csv = new(f, **options) rescue Exception f.close raise end # handle blocks like Ruby's open(), not like the CSV library if block_given? begin yield csv ensure csv.close end else csv end end
This method can be used to easily parse CSV out of a String. You may either provide a block which will be called with each row of the String in turn, or just use the returned Array of Arrays (when no block is given).
Pass the str to read from and, optionally, any options. See Options for Parsing.
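A brief sketch of the two calling styles, plus the effect of headers: true on the return value. The sample data is illustrative:

```ruby
require "csv"

data = "name,count\nfoo,0\nbar,1\n"

# Without a block: returns an Array of Arrays.
all_rows = CSV.parse(data)

# With a block: each row is yielded in turn.
names = []
CSV.parse(data) { |row| names << row.first }

# With headers: true, the non-block form returns a CSV::Table instead.
table = CSV.parse(data, headers: true)
first_count = table[0]["count"]
```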
# File lib/csv.rb, line 963
def parse(str, **options, &block)
  csv = new(str, **options)

  return csv.each(&block) if block_given?

  # slurp contents, if no block is given
  begin
    csv.read
  ensure
    csv.close
  end
end
Returns the new Array created by parsing the first line of string or io using the specified options.
Argument string should be a String object; it will be put into a new StringIO object positioned at the beginning.
Argument io should be an IO object; it will be positioned at the beginning.
For options, see Options for Parsing.
Returns data from the first line from a String object:
CSV.parse_line('foo,0') # => ["foo", "0"]
Returns data from the first line from a File object:
File.write('t.csv', 'foo,0')
CSV.parse_line(File.open('t.csv')) # => ["foo", "0"]
Ignores lines after the first:
CSV.parse_line("foo,0\nbar,1\nbaz,2") # => ["foo", "0"]
Returns nil if the argument is an empty String:
CSV.parse_line('') # => nil
Raises an exception if the argument is nil:
# Raises ArgumentError (Cannot parse nil as CSV):
CSV.parse_line(nil)
# File lib/csv.rb, line 1012 def parse_line(line, **options) new(line, **options).each.first end
Use to slurp a CSV file into an Array of Arrays. Pass the path to the file and options. See Options for Parsing.
This method also understands an additional :encoding parameter that you can use to specify the Encoding of the data in the file to be read. You must provide this unless your data is in Encoding::default_external(). CSV will use this to determine how to parse the data. You may provide a second Encoding to have the data transcoded as it is read. For example, encoding: "UTF-32BE:UTF-8" would read UTF-32BE data from the file but transcode it to UTF-8 before CSV parses it.
# File lib/csv.rb, line 1030 def read(path, **options) open(path, **options) { |csv| csv.read } end
Alias for CSV::read().
# File lib/csv.rb, line 1035 def readlines(path, **options) read(path, **options) end
A shortcut for:
CSV.read(
  path,
  **{
    headers: true,
    converters: :numeric,
    header_converters: :symbol
  }.merge(options)
)
See Options for Parsing.
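For a sense of what those defaults produce, here is a small sketch that writes a sample file to a temporary directory and reads it back with CSV.table. The file name and contents are assumptions:

```ruby
require "csv"
require "tmpdir"

headers = nil
counts  = nil
Dir.mktmpdir do |dir|
  path = File.join(dir, "counts.csv")  # hypothetical file name
  File.write(path, "Name,Count\nfoo,0\nbar,1\n")

  table   = CSV.table(path)
  headers = table.headers  # headers are converted to Symbols
  counts  = table[:count]  # numeric fields are converted by :numeric
end
```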
# File lib/csv.rb, line 1047 def table(path, **options) default_options = { headers: true, converters: :numeric, header_converters: :symbol, } options = default_options.merge(options) read(path, **options) end
Public Instance Methods
The primary write method for wrapped Strings and IOs. The row (an Array or CSV::Row) is converted to CSV and appended to the data source. When a CSV::Row is passed, only the row’s fields() are appended to the output.
The data source must be open for writing.
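A minimal sketch of appending both an Array and a CSV::Row to a wrapped String. The values are illustrative:

```ruby
require "csv"

out = +""          # unfrozen String used as the data source
csv = CSV.new(out)

csv << ["name", "count"]                             # an Array
csv << CSV::Row.new(["name", "count"], ["foo", 0])   # only the fields are written
```

Because << returns the CSV object itself, calls can also be chained.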
# File lib/csv.rb, line 1410 def <<(row) writer << row self end
# File lib/csv.rb, line 1342 def binmode? if @io.respond_to?(:binmode?) @io.binmode? else false end end
The encoded :col_sep used in parsing and writing. See CSV::new for details.
# File lib/csv.rb, line 1189 def col_sep parser.column_separator end
You can use this method to install a CSV::Converters built-in, or provide a block that handles a custom conversion.
If you provide a block that takes one argument, it will be passed the field and is expected to return the converted value or the field itself. If your block takes two arguments, it will also be passed a CSV::FieldInfo Struct, containing details about the field. Again, the block should return a converted field or the field itself.
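Both styles can be sketched together: a built-in converter installed by name, followed by a two-argument block that consults the FieldInfo. The data and the choice of column are assumptions for illustration:

```ruby
require "csv"

csv = CSV.new("1,x,2.5\n")

csv.convert(:integer)  # built-in converter, selected by name

# Custom two-argument block: upcase only the field at index 1.
csv.convert do |field, field_info|
  field_info.index == 1 ? field.upcase : field
end

row = csv.shift
```

The field "2.5" is left as a String because Integer() does not accept it and the block returns it unchanged.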
# File lib/csv.rb, line 1432 def convert(name = nil, &converter) parser_fields_converter.add_converter(name, &converter) end
Returns the current list of converters in effect. See CSV::new for details. Built-in converters will be returned by name, while others will be returned as is.
# File lib/csv.rb, line 1230 def converters parser_fields_converter.map do |converter| name = Converters.rassoc(converter) name ? name.first : converter end end
Yields each row of the data source in turn.
Support for Enumerable.
The data source must be open for reading.
# File lib/csv.rb, line 1460 def each(&block) parser_enumerator.each(&block) end
# File lib/csv.rb, line 1378 def eof? return false if @eof_error begin parser_enumerator.peek false rescue MalformedCSVError => error @eof_error = error false rescue StopIteration true end end
The limit for field size, if any. See CSV::new for details.
# File lib/csv.rb, line 1213 def field_size_limit parser.field_size_limit end
# File lib/csv.rb, line 1350 def flock(*args) raise NotImplementedError unless @io.respond_to?(:flock) @io.flock(*args) end
Returns true if all output fields are quoted. See CSV::new for details.
# File lib/csv.rb, line 1298 def force_quotes? @writer_options[:force_quotes] end
Identical to CSV#convert(), but for header rows.
Note that this method must be called before header rows are read to have any effect.
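A short sketch of installing a built-in header converter before any row is read. The sample data is illustrative:

```ruby
require "csv"

csv = CSV.new("First Name,Count\nfoo,0\n", headers: true)

# Install the converter before the header row is read.
csv.header_convert(:symbol)

row = csv.shift
row_headers = row.headers
first_name  = row[:first_name]
```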
# File lib/csv.rb, line 1447 def header_convert(name = nil, &converter) header_fields_converter.add_converter(name, &converter) end
Returns the current list of converters in effect for headers. See CSV::new for details. Built-in converters will be returned by name, while others will be returned as is.
# File lib/csv.rb, line 1282 def header_converters header_fields_converter.map do |converter| name = HeaderConverters.rassoc(converter) name ? name.first : converter end end
Returns true if the next row read will be a header row.
# File lib/csv.rb, line 1480 def header_row? parser.header_row? end
Returns nil if headers will not be used, true if they will but have not yet been read, or the actual headers after they have been read. See CSV::new for details.
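The three return values can be sketched as follows. This follows the description above and has not been verified against every version of the library; the sample data is illustrative:

```ruby
require "csv"

data = "a,b\n1,2\n"

# Headers not in use.
without_headers = CSV.new(data).headers

# Headers in use, but not yet read.
csv = CSV.new(data, headers: true)
before_read = csv.headers

# After the first row has been read, the actual headers are available.
csv.shift
after_read = csv.headers
```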
# File lib/csv.rb, line 1250 def headers if @writer @writer.headers else parsed_headers = parser.headers return parsed_headers if parsed_headers raw_headers = @parser_options[:headers] raw_headers = nil if raw_headers == false raw_headers end end
Returns a simplified description of the key CSV attributes in an ASCII compatible String.
# File lib/csv.rb, line 1509 def inspect str = ["#<", self.class.to_s, " io_type:"] # show type of wrapped IO if @io == $stdout then str << "$stdout" elsif @io == $stdin then str << "$stdin" elsif @io == $stderr then str << "$stderr" else str << @io.class.to_s end # show IO.path(), if available if @io.respond_to?(:path) and (p = @io.path) str << " io_path:" << p.inspect end # show encoding str << " encoding:" << @encoding.name # show other attributes ["lineno", "col_sep", "row_sep", "quote_char"].each do |attr_name| if a = __send__(attr_name) str << " " << attr_name << ":" << a.inspect end end ["skip_blanks", "liberal_parsing"].each do |attr_name| if a = __send__("#{attr_name}?") str << " " << attr_name << ":" << a.inspect end end _headers = headers str << " headers:" << _headers.inspect if _headers str << ">" begin str.join('') rescue # any encoding error str.map do |s| e = Encoding::Converter.asciicompat_encoding(s.encoding) e ? s.encode(e) : s.force_encoding("ASCII-8BIT") end.join('') end end
# File lib/csv.rb, line 1355 def ioctl(*args) raise NotImplementedError unless @io.respond_to?(:ioctl) @io.ioctl(*args) end
Returns true if illegal input is handled. See CSV::new for details.
# File lib/csv.rb, line 1303 def liberal_parsing? parser.liberal_parsing? end
The last row read from this file.
# File lib/csv.rb, line 1328 def line parser.line end
The line number of the last row read from this file. Fields with nested line-end characters will not affect this count.
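To illustrate that a quoted field containing a line ending does not affect the count, consider this sketch with illustrative sample data:

```ruby
require "csv"

# The first row contains a quoted field with an embedded newline.
csv = CSV.new("a,\"b\nc\"\nd,e\n")

first = csv.shift
lineno_after_first = csv.lineno

csv.shift
lineno_after_second = csv.lineno
```

The count follows rows read, not physical lines of the data source.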
# File lib/csv.rb, line 1317 def lineno if @writer @writer.lineno else parser.lineno end end
# File lib/csv.rb, line 1360 def path @io.path if @io.respond_to?(:path) end
The encoded :quote_char used in parsing and writing. See CSV::new for details.
# File lib/csv.rb, line 1205 def quote_char parser.quote_character end
Slurps the remaining rows and returns an Array of Arrays, or a CSV::Table when headers are used.
The data source must be open for reading.
# File lib/csv.rb, line 1469 def read rows = to_a if parser.use_headers? Table.new(rows, headers: parser.headers) else rows end end
Returns true if headers will be returned as a row of results. See CSV::new for details.
# File lib/csv.rb, line 1265 def return_headers? parser.return_headers? end
Rewinds the underlying IO object and resets CSV’s lineno() counter.
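A minimal sketch of reading a source twice by rewinding in between, with illustrative sample data:

```ruby
require "csv"

csv = CSV.new("a,b\nc,d\n")

csv.read                       # consume every row
lineno_before = csv.lineno

csv.rewind                     # back to the start; the counter is reset
lineno_after = csv.lineno
first_again  = csv.shift
```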
# File lib/csv.rb, line 1393 def rewind @parser = nil @parser_enumerator = nil @eof_error = nil @writer.rewind if @writer @io.rewind end
The encoded :row_sep used in parsing and writing. See CSV::new for details.
# File lib/csv.rb, line 1197 def row_sep parser.row_separator end
The primary read method for wrapped Strings and IOs. A single row is pulled from the data source, parsed and returned as an Array of fields (if header rows are not used) or a CSV::Row (when header rows are used).
The data source must be open for reading.
# File lib/csv.rb, line 1491 def shift if @eof_error eof_error, @eof_error = @eof_error, nil raise eof_error end begin parser_enumerator.next rescue StopIteration nil end end
Returns true if blank lines are skipped by the parser. See CSV::new for details.
# File lib/csv.rb, line 1293 def skip_blanks? parser.skip_blanks? end
The regex marking a line as a comment. See CSV::new for details.
# File lib/csv.rb, line 1221 def skip_lines parser.skip_lines end
# File lib/csv.rb, line 1364 def stat(*args) raise NotImplementedError unless @io.respond_to?(:stat) @io.stat(*args) end
# File lib/csv.rb, line 1369 def to_i raise NotImplementedError unless @io.respond_to?(:to_i) @io.to_i end
# File lib/csv.rb, line 1374 def to_io @io.respond_to?(:to_io) ? @io.to_io : @io end
Returns true if unconverted_fields() will be added to parsed results. See CSV::new for details.
# File lib/csv.rb, line 1241 def unconverted_fields? parser.unconverted_fields? end
Returns true if headers are written in output. See CSV::new for details.
# File lib/csv.rb, line 1273 def write_headers? @writer_options[:write_headers] end
Private Instance Methods
# File lib/csv.rb, line 1641 def build_fields_converter(initial_converters, options) fields_converter = FieldsConverter.new(options) normalize_converters(initial_converters).each do |name, converter| fields_converter.add_converter(name, &converter) end fields_converter end
# File lib/csv.rb, line 1623 def build_header_fields_converter specific_options = { builtin_converters: HeaderConverters, accept_nil: true, } options = @base_fields_converter_options.merge(specific_options) build_fields_converter(@initial_header_converters, options) end
# File lib/csv.rb, line 1611 def build_parser_fields_converter specific_options = { builtin_converters: Converters, } options = @base_fields_converter_options.merge(specific_options) build_fields_converter(@initial_converters, options) end
# File lib/csv.rb, line 1636 def build_writer_fields_converter build_fields_converter(@initial_write_converters, @write_fields_converter_options) end
Processes fields with @converters, or @header_converters if headers is passed as true, returning the converted field set. Any converter that changes the field into something other than a String halts the pipeline of conversion for that field. This is primarily an efficiency shortcut.
# File lib/csv.rb, line 1586 def convert_fields(fields, headers = false) if headers header_fields_converter.convert(fields, nil, 0) else parser_fields_converter.convert(fields, @headers, lineno) end end
# File lib/csv.rb, line 1549 def determine_encoding(encoding, internal_encoding) # honor the IO encoding if we can, otherwise default to ASCII-8BIT io_encoding = raw_encoding return io_encoding if io_encoding return Encoding.find(internal_encoding) if internal_encoding if encoding encoding, = encoding.split(":", 2) if encoding.is_a?(String) return Encoding.find(encoding) end Encoding.default_internal || Encoding.default_external end
# File lib/csv.rb, line 1619 def header_fields_converter @header_fields_converter ||= build_header_fields_converter end
# File lib/csv.rb, line 1564 def normalize_converters(converters) converters ||= [] unless converters.is_a?(Array) converters = [converters] end converters.collect do |converter| case converter when Proc # custom code block [nil, converter] else # by name [converter, nil] end end end
# File lib/csv.rb, line 1649 def parser @parser ||= Parser.new(@io, parser_options) end
# File lib/csv.rb, line 1658 def parser_enumerator @parser_enumerator ||= parser.parse end
# File lib/csv.rb, line 1607 def parser_fields_converter @parser_fields_converter ||= build_parser_fields_converter end
# File lib/csv.rb, line 1653 def parser_options @parser_options.merge(header_fields_converter: header_fields_converter, fields_converter: parser_fields_converter) end
Returns the encoding of the internal IO object.
# File lib/csv.rb, line 1597 def raw_encoding if @io.respond_to? :internal_encoding @io.internal_encoding || @io.external_encoding elsif @io.respond_to? :encoding @io.encoding else nil end end
# File lib/csv.rb, line 1662 def writer @writer ||= Writer.new(@io, writer_options) end
# File lib/csv.rb, line 1632 def writer_fields_converter @writer_fields_converter ||= build_writer_fields_converter end
# File lib/csv.rb, line 1666 def writer_options @writer_options.merge(header_fields_converter: header_fields_converter, fields_converter: writer_fields_converter) end