First, let me say that I am an absolute newbie to Ruby, so please be
tolerant of my newbie question.
My situation is this. I am gathering financial data, and am about to
change data suppliers. I want to "merge" the files from both suppliers
to have as much data history as possible. I have the data in ASCII
format in a comma delimited file.
I have the data in the following structure:
c:\data\1Original\abc.csv - a new data file
c:\data\2Processed\abc.csv - the historical file and my processing
reference
Each file has the same file structure of:
Symbol, Date, Open, High, Low, Close, Volume
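For illustration, one row in this layout could be split into typed fields roughly like this. It is only a sketch: the parse_row name and the hash keys are made up for the example, and it assumes every row is well-formed and the date is always written as YYYYMMDD.

```ruby
require 'date'

# Hypothetical helper: split one CSV line into typed fields.
# Assumes the layout Symbol, Date, Open, High, Low, Close, Volume
# and a date written as YYYYMMDD.
def parse_row(line)
  symbol, date, open_price, high, low, close, volume =
    line.strip.split(/\s*,\s*/)
  {
    symbol: symbol,
    date:   Date.strptime(date, '%Y%m%d'),
    open:   Float(open_price),
    high:   Float(high),
    low:    Float(low),
    close:  Float(close),
    volume: Integer(volume)
  }
end

row = parse_row('abc, 20060901, 1.5, 2.1, 1.4, 1.9, 123456')
puts row[:date]   # prints 2006-09-01
puts row[:close]  # prints 1.9
```

Float() and Integer() raise on malformed input, so a bad row shows up immediately instead of silently turning into 0.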
I already have a process that references the files in the
c:\data\2Processed directory structure.
Currently I have figured out how to walk the directory tree and copy any
NEW files into the Processed directory. I am hung up on the merging of
the files into the processed directory.
Sample files to demonstrate:
c:\data\1Original\abc.csv (new data)
abc, 20060901, 1.5, 2.1, 1.4, 1.9, 123456
abc, 20060902, 1.9, 2.3, 1.8, 2.3, 147454
c:\data\2Processed\abc.csv (historical)
abc, 20010101, 2.1, 2.5, 2.0, 2.45, 254677
abc, 20010102, 2.4, 2.6, 2.4, 2.5, 333444
.......
abc, 20060901, 1.5, 2.1, 1.4, 1.9, 123456
I need to create
c:\data\2Processed\abc.csv (historical)
abc, 20010101, 2.1, 2.5, 2.0, 2.45, 254677
abc, 20010102, 2.4, 2.6, 2.4, 2.5, 333444
.......
abc, 20060901, 1.5, 2.1, 1.4, 1.9, 123456
abc, 20060902, 1.9, 2.3, 1.8, 2.3, 147454
So, I am stuck on how to read the files in and merge them.
Here is my thought process:
1. Read the files into arrays (of rows)
2. Check the dates of the rows
3. Output the early dates from the historical file
4. Output the common data from either file (probably historical as
already in it)
5. Output new data from new file
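The five steps above might be sketched as follows. This is a rough sketch, not a tested solution for the real data: the names row_date, merge_rows and merge_files are hypothetical, every row is assumed to be well-formed with a YYYYMMDD date as its second field, and the historical row is assumed to win when both files contain the same date.

```ruby
# Hypothetical helper: pull the YYYYMMDD date (second field) out of a row.
def row_date(line)
  line.split(',')[1].to_s.strip
end

# Merge two arrays of CSV lines by date. Where both arrays have the same
# date, the historical row is kept (an assumption; swap the order of the
# two arrays below to prefer the new data instead).
def merge_rows(historical, fresh)
  by_date = {}
  (fresh + historical).each do |line|
    next if line.strip.empty?
    # Historical rows come last, so they overwrite fresh rows with the same date.
    by_date[row_date(line)] = line.chomp
  end
  # YYYYMMDD strings sort chronologically when compared as plain strings.
  by_date.keys.sort.map { |date| by_date[date] }
end

# Hypothetical wrapper: read both files into arrays of rows, merge them,
# and rewrite the historical file with the result.
def merge_files(new_path, historical_path)
  merged = merge_rows(File.readlines(historical_path), File.readlines(new_path))
  File.open(historical_path, 'w') { |f| merged.each { |line| f.puts line } }
  merged
end

historical = ['abc, 20010101, 2.1, 2.5, 2.0, 2.45, 254677',
              'abc, 20060901, 1.5, 2.1, 1.4, 1.9, 123456']
fresh      = ['abc, 20060901, 1.5, 2.1, 1.4, 1.9, 123456',
              'abc, 20060902, 1.9, 2.3, 1.8, 2.3, 147454']
puts merge_rows(historical, fresh)
```

Keying a hash by date handles steps 3 to 5 in one pass, at the cost of holding both files in memory, which should be fine for daily price files of this size.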
So, the code I have so far is this...
puts 'start'
require 'find'
require 'fileutils' # replaces 'ftools', which was removed in Ruby 1.9
dir1original = 'c:/Data/1Original/'
dir2processed = 'c:/Data/2Processed/'
puts 'Here'
# List everything under the Original directory
Find.find(dir1original) { |path| puts path }
Find.find(dir1original) do |path|
  puts 'The current item is ' + path
  if File.file? path
    puts path + ' is a file'
  end
end
puts 'create log files'
# Set up Log files and Specific output files
runlogfile = 'c:/Data/runlog.txt'
open(runlogfile, "w") { |f| f << "Runlog of StepOneIncrement\n" }
puts 'Created runlog file'
open('c:/Data/Exist1not2.txt', "w") { |f| f << "List of files from Original not in Processed\n" }
puts 'Created Exist1not2'
open('c:/Data/Exist2not1.txt', "w") { |f| f << "List of files from Processed not in Original\n" }
puts 'Created Exist2not1'
# Walk the Original Directory Tree and check for files and matches
Find.find(dir1original) do |path|
  if File.file? path
    # Map the path under Original to the same relative path under Processed
    second = path.sub(dir1original, dir2processed)
    if File.file? second
      puts 'Found'
      if File.size(path) != File.size(second)
        puts 'Not same size'
        # Now we will have to look at the data
        puts open(path) { |f| f.read(20) }
        puts open(second) { |f| f.read(20) }
        # search out parsedate for possibly parsing the date data
        # need help here on
        # read files into an array
        # date based calculations
        # merging the files
      else
        puts 'Complete Match'
        # if FileUtils.compare_file(path, second)
      end
    else
      filename = path.sub(dir1original, '')
      puts filename + ' Not Found'
      # an alternate method to get the file name
      puts File.basename(path) + ' Not Found'
      puts File.basename(path, ".csv") + ' Not Found'
      open('c:/Data/Exist1not2.txt', "a") { |f| f << filename + "\n" }
      # New file: copy it into the Processed directory
      FileUtils.cp(path, second)
    end
  end
end
So, some help on the arrays would be GREATLY Appreciated.
Snoopy
--
Posted via http://www.ruby-forum.com/.