I am working with some code that processes CSV files. Each row corresponds to an existing record and the record is updated in response to the column values. This is not an uncommon task. The existing code implements this in an also not uncommon way by intermixing row parsing and record updating. For example, assume we are updating Foos
class Foo attr id, :location, :valuation # ... endA typical intermixing is
row = [...] raise "unusable foo id" if row[0].blank? foo = Foo.find(row[0].to_i) raise "foo not found with id #{row[0]}" unless foo raise "unusable town location" if row[1].blank? location = Location.find_by(town: row[1]) raise "location not found with town #{row[1]}" unless location; foo.location = location raise "unusable valuation" unless values[2].to_i < 10_000 foo.valulation = values[2].to_iWhile this initially seems like a reasonable approach it quickly breaks down as the number of columns increase, column format is non-trival, and there are column (or row) interdependencies. But the more significant problem is that the parsing and the updating can't be tested individually. This makes the test harder to write, understand, and maintain.
It is always better to first parse the raw data, validate it, and then use it. Eg
class Record attr :id, :town, :valuation attr :foo, :location def initialize(values) raise "unusable foo id" unless /^\s*(\d+)\s*$/ =~ values[0] id = $1.to_i raise "unusable town location" unless /^\s*(.+)\s*$/ =~ values[1] location = $1 raise "unusable valuation" unless /^\s*(\d+)\s*$/ =~ values[2] valuation = $1.to_i end def validate foo = Foo.find(id) raise "foo not found with id #{id}" unless foo location = Location.find_by(town:) raise "location not found with town #{town}" unless location raise "valuation does not match the minimum" unless valuation >= 10_000; end end # read the raw data rows = [[...], ...] # parse and validate the data records = rows.map do |row| record = Record.new(row) record.validate end # use the data records.each do |record| record.foo.location = record.location record.foo.valuation = record.valuation record.foo.save endThis parse, validate, and use approach is approporate for all cases where you are bringing data from the outside into your application, no matter what the outside source.
ps. These small, helper classes are your friends. Prefer them over your language's hash primitive as they provide great control. Most languages have efficient syntax for creating and using them.