File position and buffers

Hi all,

In a bit of a rut. Have a file with a lot of text. I want to separate
the text in this file as entries. Each entry that I would be separating,
would be done so using IO.pos and when that cursor reaches a certain
character in the file, it will ideally place all the content before that
character into a buffer. Then the cursor will continue reading until it
hits that same character again and put that content into a buffer, so on
and so forth. (Character I'll be reading would be a greater than symbol)

Would I use a do iterator or use a while loop with a gets method? Or
readlines perhaps?

File:

>entry 1
rubyrubyrubyrubyrubyrubyrubyruby
(newline here which I don't want)
>entry 2
rubyrubyrubyrubyrubyrubyrubyruby

Entry1 and entry2 will be in separate buffers which I would be able to
access again.

buffer1 = >entry 1
rubyrubyrubyrubyrubyrubyrubyruby

buffer2 = >entry 2
rubyrubyrubyrubyrubyrubyrubyruby

PS. The file is huge, so I don't want to read it into memory. What is
the best way to approach this? Any suggestions or comments would be
helpful. Thanks!

--
Posted via http://www.ruby-forum.com/.

You could use foreach checking if each line starts with '>'. If it doesn't
you accumulate in a buffer; if it does you do something with the current
buffer and start a new one.
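
A minimal sketch of that idea, assuming every entry begins with a line that starts with '>'. The helper name each_entry and the StringIO stand-in are made up for illustration; with a real file you could pass File.foreach(path) (called without a block) instead.

```ruby
require 'stringio'

# Hypothetical helper: groups lines into entries, where a line starting
# with '>' opens a new entry. Only one entry is held in memory at a time.
def each_entry(lines)
  buffer = nil
  lines.each do |line|
    if line.start_with?('>')
      yield buffer if buffer      # hand over the finished entry
      buffer = line.dup           # start a new one with its header line
    elsif buffer
      buffer << line              # accumulate the body of the entry
    end
  end
  yield buffer if buffer          # the last entry has no '>' after it
end

sample = StringIO.new(">entry 1\nruby\nruby\n>entry 2\nruby\n")
each_entry(sample.each_line) { |entry| p entry }
```

Anything that appears before the first '>' line is silently skipped in this sketch.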

Jesus

On 27/04/2011 22:04, "Cee Joe" <cyril_jose@ymail.com> wrote:

hi Cee -

  this may well be WAY too simple for your needs, but it seems to me you
could do something like this:

(0text.txt is a file with 7 lines that say rubyrubyrubyetc.)

  f = "0text.txt"
  file = File.open(f)
  buffer = []
  bufferindex = 0

  file.each_line{|line|
       buffer[bufferindex] = line.chomp
       bufferindex += 1
  }

p buffer[0]
p buffer[1]
p buffer[2]
#etc...

  of course you could also set a maximum number of lines per buffer:

  f = "0text.txt"
  file = File.open(f)
  # missing keys get a fresh empty array on first access
  buffer = Hash.new{|hash, key| hash[key] = []}
  bufferkey = 0
  maxbuflength = 3

  file.each_line{|line|
    # move on to the next buffer once the current one is full
    bufferkey += 1 if buffer[bufferkey].length == maxbuflength
    buffer[bufferkey] << line.chomp
  }
  file.close

p buffer[0]
p buffer[1]
p buffer[2]

  if the file's extremely long i guess you'd want to write a method to
dump the buffers at some point too.

  maybe this is dumb, i hope not!
  cheers,

  -j


Cee Joe wrote in post #995381:

Hi all,

In a bit of a rut. Have a file with a lot of text. I want to separate
the text in this file as entries. Each entry that I would be separating,
would be done so using IO.pos and when that cursor reaches a certain
character in the file, it will ideally place all the content before that
character into a buffer. Then the cursor will continue reading until it
hits that same character again and put that content into a buffer, so on
and so forth. (Character I'll be reading would be a greater than symbol)

There is absolutely no reason to use pos() to read that file.

Would I use a do iterator or use a while loop with a gets method? Or
readlines perhaps?

File:

entry 1

rubyrubyrubyrubyrubyrubyrubyruby
(newline here which I don't want)

chomp() removes one newline, if present, at the end of a string.
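
A quick illustration of that behaviour, with a made-up sample string:

```ruby
line = "rubyruby\n\n"

once  = line.chomp        # strips a single trailing newline
twice = line.chomp.chomp  # a second call strips the next one

p once
p twice
```

Here `once` still ends in one "\n", while `twice` has none left.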

PS. The file is huge, so I don't want to read it into memory. What is
the best way to approach this? Any suggestions or comments would be
helpful. Thanks!

Well, then you have to tell us what you want to do with the segments of
the file. If you store each chunk in a variable, then you will have
read the whole file into memory.

You say your file looks like this:

>entry 1 <---WHAT'S AT THE END OF THIS LINE??
rubyrubyrubyrubyruby <---WHAT'S AT THE END OF THIS LINE??
(newline here which I don't want)

Those look like newlines. Are you saying that your data is organized
into paragraphs, i.e. separated by two newlines? Like this:

>entry1\n
rubyrubyruby\n
\n
>entry2\n
rubyrubyruby\n
\n
>entry3

A paragraph is defined as two consecutive newlines between lines. Note
that in ruby the default line separator is one newline. But you can
change that to two newlines--or any other string:

require 'stringio'

str =<<ENDOFSTRING
>entry1
11111111111

>entry2
22222222222

>entry3
33333333333
ENDOFSTRING

input = StringIO.new(str)
$/ = "\n\n"

input.each do |para|
  p para.sub(/\n+ \z/xms, "")
end

--output:--
">entry1\n11111111111"
">entry2\n22222222222"
">entry3\n33333333333"


This shows the output better:

e = input.enum_for(:each) #You can do this for a File too.

e.each_slice(2) do |buffer1, buffer2|
  puts "buffer1: #{buffer1.inspect}"
  puts "buffer2: #{buffer2.inspect}"
  puts "-" * 10
end

--output:--
buffer1: ">entry1\n11111111111\n\n"
buffer2: ">entry2\n22222222222\n\n"
----------
buffer1: ">entry3\n33333333333\n"
buffer2: nil
----------

Before doing the sub() on buffer2, you will have to check if it's nil:

  if buffer2.nil?
    #don't do a sub()
  else
    #do the sub()
  end


One of the simplest approaches is to use Ruby's ability to use
arbitrary record delimiters:

File.foreach file_name, ">" do |chunk|
  chunk.chomp! ">"
  chunk.gsub!(/\r\n?|\n/, '') # remove line terminators
  # if you need the leading ">":
  # chunk[0,0] = ">"
  p chunk
end

Kind regards

robert


--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Thanks guys for your helpful comments. I will be more descriptive. I am
an intern and my mentor wants me to use the IO.pos to read the
characters of the file until the character reaches the ">" symbol. So
upon the cursor reaching the ">" symbol (which is the start of a new
entry), he wants me to place that previous entry in a buffer. Here is
the actual test file I am working with:

>gi|329295464|ref|NM_2005745.3Acc1| Def1 zgc:65895 (zgc:65895), mRNA\n
AGCTCGGGGGCTCTAGCGATTTAAGGAGCGATGCGATCGAGCTGACCGTCGCG\n
\n
>gi|456299107|ref|NM_2342343.3Acc2| Def2 zgc:65895 (zgc:65895), mRNA\n
GTCGCTGGGTCGAAAAGTGGTGCTATATCGCGGCTCGCGTCGATGTCGCGATG\n
CGTGCGCGCGAGAGCGCGCTATGATGAAAGGATGAGAGAG\n
\n
>gi|3542945647|ref|NM_7453343.5Acc3| Def3 zgc:65895 (zgc:65895), mRNA\n
CGTGCGGGGABCCGTACGTGCCGTGGGGGTTTAATAGCGCGCCATCTGAGCAG\n
TTAGTCGCTGACGCATGCACG\n
\n

7stud, you are right there are two consecutive newlines which I failed
to mention. This should be the output of a buffer for one entry:

>gi|456299107|ref|NM_2342343.3Acc2| Def2 zgc:65895 (zgc:65895), mRNA\n
GTCGCTGGGTCGAAAAGTGGTGCTATATCGCGGCTCGCGTCGATGTCGCGATG <-- no "\n"
CGTGCGCGCGAGAGCGCGCTATGATGAAAGGATGAGAGAG <-- no "\n"

Notice how the newlines are gone. So with the exception of the header in
each entry, the newlines should be gone and be placed in a buffer. I am
lost on how to use the IO.pos and a file iterator to make sure each
respective entry goes into a buffer without the file being indexed into
memory.

Thanks in advance, I'm new to the language and trying to wrap my head
around it.


You still have not told us what you are supposed to do with the stuff
you read in?? You can read a file line by line and print out each line
as you go and the maximum amount of memory used will be one line's
worth. However, if you are supposed to store all the lines in an
array, then you will read the whole file into memory.

Thanks guys for your helpful comments. I will be more
descriptive. I am an intern and my mentor wants me to
use the IO.pos to read the characters of the file
until the character reaches the ">" symbol.

What problems is that giving you? You can create a loop, read the
character at pos(i), then increment i, and do what Jesús Gabriel y Galán
suggested.


If you don't have to use pos(), then see my first post.


hi Cee -

  copying the text you posted above into the file "0text.txt" and
running this:

  f = "0text.txt"
  file = File.open(f)
  buffer = []
  bufferindex = 0

  file.each_line(sep=">"){|line|
       # chomp only strips a trailing "\n" here, so the ">" stays
       buffer[bufferindex] = line.chomp
       bufferindex += 1
  }

p buffer[0]
p buffer[1]
p buffer[2]
p buffer[3]

  i get this as output:

#=> ">"
#=> "gi|329295464|ref|NM_2005745.3Acc1| Def1 zgc:65895 (zgc:65895),
mRNA\\n\nAGCTCGGGGGCTCTAGCGATTTAAGGAGCGATGCGATCGAGCTGACCGTCGCG\\n\n\\n\n>"
#=> "gi|456299107|ref|NM_2342343.3Acc2| Def2 zgc:65895 (zgc:65895),
mRNA\\n\nGTCGCTGGGTCGAAAAGTGGTGCTATATCGCGGCTCGCGTCGATGTCGCGATG\\n\nCGTGCGCGCGAGAGCGCGCTATGATGAAAGGATGAGAGAG\\n\n\\n\n>"
#=> "gi|3542945647|ref|NM_7453343.5Acc3| Def3 zgc:65895 (zgc:65895),
mRNA\\n\nCGTGCGGGGABCCGTACGTGCCGTGGGGGTTTAATAGCGCGCCATCTGAGCAG\\n\nTTAGTCGCTGACGCATGCACG\\n\n\\n"

  does this work for you? you could easily write ways to deal with,
dump, and reset the buffers when they fill up. you can of course also
clean up all the "\n"'s...

  i agree with 7stud that using #.pos and #.gets seems like a long walk
off a short pier. i'm pretty green myself, and there are probably
better ways to iterate through the file, but #.each_line(sep=">") works
just fine, and doesn't eat up memory.

  - j


The first thing everyone in this thread needs to realize is that '>' is
not the separator you want to look for. That's because you don't care
what character marks the beginning of every entry, rather you care what
character marks the end of every entry. The end of every entry is
marked by the string "\n\n", so you should use that as your input line
terminator. Remember, ruby uses "\n" for the input line separator by
default, which means that when you read a file using IO#each, ruby reads
lines--where the end of a line is marked by a newline. However, you can
change the input line separator to the string "\n\n" (or any other
string):

$/ = "\n\n"

Once you have an entry, then you just need to do a little housekeeping
and remove some "\n" characters.

require 'stringio'

str =<<ENDOFSTRING
>gi|329295464|ref|NM_2005745.3Acc1| Def1 zgc:65895 (zgc:65895), mRNA
AGCTCGGGGGCTCTAGCGATTTAAGGAGCGATGCGATCGAGCTGACCGTCGCG

>gi|456299107|ref|NM_2342343.3Acc2| Def2 zgc:65895 (zgc:65895), mRNA
GTCGCTGGGTCGAAAAGTGGTGCTATATCGCGGCTCGCGTCGATGTCGCGATG
CGTGCGCGCGAGAGCGCGCTATGATGAAAGGATGAGAGAG

>gi|3542945647|ref|NM_7453343.5Acc3| Def3 zgc:65895 (zgc:65895), mRNA
CGTGCGGGGABCCGTACGTGCCGTGGGGGTTTAATAGCGCGCCATCTGAGCAG
TTAGTCGCTGACGCATGCACG

ENDOFSTRING

input = StringIO.new(str) #Now input is just like a File

input.each(sep = "\n\n") do |para|
  buffer = ''

  lines = para.split("\n")
  buffer << lines.shift << "\n"
  lines.each do |line|
    buffer << line
  end

  puts buffer
  puts "-" * 20
end

p $/

--output:--
>gi|329295464|ref|NM_2005745.3Acc1| Def1 zgc:65895 (zgc:65895), mRNA
AGCTCGGGGGCTCTAGCGATTTAAGGAGCGATGCGATCGAGCTGACCGTCGCG
--------------------
>gi|456299107|ref|NM_2342343.3Acc2| Def2 zgc:65895 (zgc:65895), mRNA
GTCGCTGGGTCGAAAAGTGGTGCTATATCGCGGCTCGCGTCGATGTCGCGATGCGTGCGCGCGAGAGCGCGCTATGATGAAAGGATGAGAGAG
--------------------
>gi|3542945647|ref|NM_7453343.5Acc3| Def3 zgc:65895 (zgc:65895), mRNA
CGTGCGGGGABCCGTACGTGCCGTGGGGGTTTAATAGCGCGCCATCTGAGCAGTTAGTCGCTGACGCATGCACG
--------------------
"\n"

Note that passing the separator as an argument to each() never touches
the global input line separator $/ (it is still "\n" after the block
has finished, as the last line of output shows)--which is a good thing.


hi Cee -

  hmm, i'm getting a bit confused as to what exactly you're trying to do
- but if you want to load all this stuff into a buffer without the
newlines, and regardless of how many newlines you have between each
entry (assuming that an "entry" is something that starts with ">") - i
don't see why this wouldn't work:

f = "0text.txt"
file = File.open(f)
buffer = []
bufferindex = 0

file.each(sep = ">"){|line|
  buffer[bufferindex] = line
  bufferindex += 1
}

## here you would do something more interesting
buffer.collect{|line|
  line = line.delete("\n")
  p ">#{line}"
}

  which will return...
">>"
">gi|329295464|ref|NM_2005745.3Acc1| Def1 zgc:65895 (zgc:65895),

"

">gi|456299107|ref|NM_2342343.3Acc2| Def2 zgc:65895 (zgc:65895),

"

">gi|3542945647|ref|NM_7453343.5Acc3| Def3 zgc:65895 (zgc:65895),
mRNACGTGCGGGGABCCGTACGTGCCGTGGGGGTTTAATAGCGCGCCATCTGAGCAGTTAGTCGCTGACGCATGCACG"

  ...whether you have 0 or 100,000 newlines between each entry. is this
not what you're looking for?

-j


Hi Jake,

I would still need the header intact, which should be away from the rest
of the entry:

>gi|329295464|ref|NM_2005745.3Acc1| Def1 zgc:65895 (zgc:65895), mRNA
AGCTCGGGGGCTCTAGCGATTTAAGGAGCGATGCGATCGAGCTGACCGTCGCG

instead of:

">gi|329295464|ref|NM_2005745.3Acc1| Def1 zgc:65895 (zgc:65895),

"

Still need the header line so I can extract information from that and
the lines after that in each entry. Your delete() will delete all the
newlines, which would not be beneficial in this scenario. Thanks for the
input, appreciate it.

-Cee


Robert K. wrote in post #995478:


One of the simplest approaches is to use Ruby's ability to use
arbitrary record delimiters:

File.foreach file_name, ">" do |chunk|
  chunk.chomp! ">"
  chunk.gsub! /\r\n?|\n/, '' # remove line terminators

Cee Joe, are you reading the file in binary mode or text mode?


7stud -- wrote in post #995581:

You still have not told us what you are supposed to do with the stuff
you read in?? You can read a file line by line and print out each line
as you go and the maximum amount of memory used will be one line's
worth. However, if you are supposed to store all the lines in an
array, then you will read the whole file into memory.

Thanks guys for your helpful comments. I will be more
descriptive. I am an intern and my mentor wants me to
use the IO.pos to read the characters of the file
until the character reaches the ">" symbol.

I am extracting text from each entry I read in, something I have figured
out already. I want to read the file line by line and just store each
entry into a buffer when it reaches the ">" symbol. Then extract
specific info from it later. The entry lengths all vary, as there are
long and short ones. File is in text mode.

What problems is that giving you? You can create a loop, read the
character at pos(i), then increment i, and do what Jesús Gabriel y Galán
suggested.

Could you show me a simple example or refer me to a link?


7stud -- wrote in post #995683:

If you don't have to use pos(), then see my first post. At some point,
you might ask him why he thinks that pos() would be of any help at all!

Thanks jake and 7stud for replying. I tried this in irb for your first
post:

e = File.open("test/test.fasta").enum_for(:each)

=> #<Enumerable::Enumerator:0x1005777a8>

$/ = "\n\n"

=> "\n\n"

Before doing the sub() on buffer2, you will have to check if it's nil:

>if buffer2.nil?
> #don't do a sub()
> else
> #do the sub()
>end

e.each_slice(2) do |buf1, buf2|
  p buf1, buf2
  if buf2.nil?
    puts "Done"
  else
    buf2.sub(/\n+ \z/xms, "")
  end
end

Output:
">gi|329299107|ref|NM_2005745.3Acc1| Def1 zgc:65895 (zgc:65895),
mRNA\nAGCTCGGGGGCTCTAGCGATTTAAGGAGCGATGCGATCGAGCTGACCGTCGCG\n\n"
">gi|329299107|ref|NM_2342343.3Acc2| Def2 zgc:65895 (zgc:65895),
mRNA\nGTCGCTGGGTCGAAAAGTGGTGCTATATCGCGGCTCGCGTCGATGTCGCGATG\nCGTGCGCGCGAGAGCGCGCTATGATGAAAGGATGAGAGAG\n\n"
">gi|329299107|ref|NM_7453343.5Acc3| Def3 zgc:65895 (zgc:65895),
mRNA\nCGTGCGGGGABCCGTACGTGCCGTGGGGGTTTAATAGCGCGCCATCTGAGCAG\nTTAGTCGCTGACGCATGCACG\n"
nil
Done
=> nil

It still returns nil, am I doing what you suggested wrong?


7stud -- wrote in post #995821:

I suggest that people never use irb because it has too many quirks.

The first thing you need to realize is that '>' is
not the separator you want to look for. That is the second bit of
erroneous advice your mentor gave you. That's because you don't care
what character marks the beginning of every entry, rather you care what
character marks the end of every entry. The end of every entry in your
file is marked by the string "\n\n", so you should use that as your
input line terminator. Remember, ruby uses "\n" for the input line
separator by default, which means that when you read a file using
IO#each, ruby reads lines--where the end of a line is marked by a
newline.

I understand the logic, it makes sense. What if the file looked like
this, where there is one newline separating the entries?

>gi|329295464|ref|NM_2005745.3Acc1| Def1 zgc:65895 (zgc:65895), mRNA
AGCTCGGGGGCTCTAGCGATTTAAGGAGCGATGCGATCGAGCTGACCGTCGCG
>gi|456299107|ref|NM_2342343.3Acc2| Def2 zgc:65895 (zgc:65895), mRNA
GTCGCTGGGTCGAAAAGTGGTGCTATATCGCGGCTCGCGTCGATGTCGCGATG
CGTGCGCGCGAGAGCGCGCTATGATGAAAGGATGAGAGAG
>gi|3542945647|ref|NM_7453343.5Acc3| Def3 zgc:65895 (zgc:65895), mRNA
CGTGCGGGGABCCGTACGTGCCGTGGGGGTTTAATAGCGCGCCATCTGAGCAG
TTAGTCGCTGACGCATGCACG

Would an if-else (regarding "\n" and "\n\n") do the trick? I wanted to
write my code so that it would handle both scenarios. Or maybe:

case
  when "\n\n"
    <code>
  when "\n"
    <code>
end

something to that extent? Suggestions?


7stud -- wrote in post #995589:

Cee Joe, are you reading the file in binary mode or
text mode?

If you don't know, then show us the line in your code where you open the
file.


Cee Joe wrote in post #995597:

my mentor wants me to use the IO.pos to read the
characters of the file until the character reaches the ">" symbol.

IO.pos() does not read in data, so you are going to have to ask your
mentor what he means. You should also ask your mentor if this is a
lesson in how not to do things. If he doesn't reply in the affirmative,
then you should find a new mentor.

I am extracting text from each entry I read in, something I have figured
out already. I want to read the file line by line and just store each
entry into a buffer when it reaches the ">" symbol. Then extract
specific info from it later.

You told us you were not supposed to read the whole file into memory.
If you store every line in an array, then you will have read the whole
file into memory. Once again, you are not being clear on what you want
to do with the data. You need to tell us which of the following you
want to do:

1) Store every entry in an array, and "extract specific info from it
later".

2) Read one entry, do something to the entry, then discard it and read
in the next entry.
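
The difference between the two can be sketched as follows. The sample data, the StringIO stand-in for the real file, and the "\n\n" entry separator are all assumptions for illustration.

```ruby
require 'stringio'

data = ">e1\nAAA\n\n>e2\nCCC\n\n"

# 1) Store every entry: memory use grows with the size of the file.
entries = StringIO.new(data).each_line("\n\n").to_a

# 2) Process and discard: only one entry is alive at a time.
count = 0
StringIO.new(data).each_line("\n\n") do |entry|
  count += 1 if entry.start_with?('>')   # stand-in for the real work
end

p entries.size
p count
```

With a huge file, only the second form keeps memory use bounded by the size of the largest entry.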

The entry lengths all vary, as there are
long and short ones. File is in text mode.

Ok.

What problems is that giving you? You can create a loop, read the
character at pos(i), then increment i, and do what Jesús Gabriel y Galán
suggested.

You could use each_byte to read the file char by char (that assumes your
file contains only ascii characters), then when you find a '>', seek()
back to old_pos (the start of the current entry, initially the start of
the file), and use IO.sysread() to read:

old_pos = 0
pos() - old_pos

number of characters. Then do something like:

old_pos = pos()

and keep doing that. But, you will be reading every entry twice, which
is stupid.
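
For what it's worth, a rough sketch of that pos()-based approach might look like the following. It is only an illustration under some assumptions: the data is plain ASCII, the helper name each_entry_by_pos is made up, a StringIO stands in for the file, and it uses plain read() rather than sysread(), since the Ruby docs warn against mixing sysread with the other read methods.

```ruby
require 'stringio'

# Hypothetical helper: finds each '>' by scanning char by char, remembers
# the positions with pos(), then seeks back and reads the entry again.
def each_entry_by_pos(io)
  start = nil                        # position of the current entry's '>'
  loop do
    here = io.pos
    ch = io.getc
    next unless ch.nil? || ch == '>'
    if start
      io.seek(start)                 # jump back to where the entry began
      yield io.read(here - start)    # second read of the same bytes
    end
    break if ch.nil?                 # end of input
    start = here
    io.seek(here + 1)                # resume scanning just after the '>'
  end
end

input = StringIO.new(">a\nAC\nGT\n>b\nTT\n")
each_entry_by_pos(input) { |entry| p entry }
```

Every entry is read twice here (once while scanning, once via read), which is exactly the waste described above.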


Cee Joe wrote in post #995830:

7stud -- wrote in post #995821:

I suggest that people never use irb because it has too many quirks.

The first thing you need to realize is that '>' is
not the separator you want to look for. That is the second bit of
erroneous advice your mentor gave you. That's because you don't care
what character marks the beginning of every entry, rather you care what
character marks the end of every entry. The end of every entry in your
file is marked by the string "\n\n", so you should use that as your
input line terminator. Remember, ruby uses "\n" for the input line
separator by default, which means that when you read a file using
IO#each, ruby reads lines--where the end of a line is marked by a
newline.

I understand the logic, it makes sense. What if the file looked like
this, where there is one newline separating the entries?

What if you had presented that possibility from the very beginning?

require 'stringio'

str =<<ENDOFSTRING
>gi|329295464|ref|NM_2005745.3Acc1| Def1 zgc:65895 (zgc:65895), mRNA
AGCTCGGGGGCTCTAGCGATTTAAGGAGCGATGCGATCGAGCTGACCGTCGCG
>gi|456299107|ref|NM_2342343.3Acc2| Def2 zgc:65895 (zgc:65895), mRNA
GTCGCTGGGTCGAAAAGTGGTGCTATATCGCGGCTCGCGTCGATGTCGCGATG
CGTGCGCGCGAGAGCGCGCTATGATGAAAGGATGAGAGAG
>gi|3542945647|ref|NM_7453343.5Acc3| Def3 zgc:65895 (zgc:65895), mRNA
CGTGCGGGGABCCGTACGTGCCGTGGGGGTTT
AATAGCGCGCCATCTGAGCAG
TTAGTCGCTGACGCATGCACG

ENDOFSTRING

input = StringIO.new(str)
buffer = ''

input.each do |line|
  if line[0, 1] == '>'
    if buffer != ''
      puts buffer #or do something else to buffer
      puts '-' * 20
    end

    buffer = ''
    buffer << line
  else
    buffer << line.sub(/ \n+ \z /xms, '')
  end

end

puts buffer #or do something else to buffer

--output:--
>gi|329295464|ref|NM_2005745.3Acc1| Def1 zgc:65895 (zgc:65895), mRNA
AGCTCGGGGGCTCTAGCGATTTAAGGAGCGATGCGATCGAGCTGACCGTCGCG
--------------------
>gi|456299107|ref|NM_2342343.3Acc2| Def2 zgc:65895 (zgc:65895), mRNA
GTCGCTGGGTCGAAAAGTGGTGCTATATCGCGGCTCGCGTCGATGTCGCGATGCGTGCGCGCGAGAGCGCGCTATGATGAAAGGATGAGAGAG
--------------------
>gi|3542945647|ref|NM_7453343.5Acc3| Def3 zgc:65895 (zgc:65895), mRNA
CGTGCGGGGABCCGTACGTGCCGTGGGGGTTTAATAGCGCGCCATCTGAGCAGTTAGTCGCTGACGCATGCACG
