"a string".xor("another string")

I want to do an XOR of two strings:

"a string".xor("another string")

It's not that hard to implement, but it's not fast either,
since it walks through the data string, byte-by-byte.

Any ideas? For example: "It's memory-hungry!". Any solutions?

gegroet,
Erik V. - http://www.erikveen.dds.nl/

···

----------------------------------------------------------------

class String
   def xor(other)
     if other.empty?
       self
     else
       a1 = self.unpack("c*")
       a2 = other.unpack("c*")

       a2 *= 2 while a2.length < a1.length

       a1.zip(a2).collect{|c1,c2| c1^c2}.pack("c*")
     end
   end
end

----------------------------------------------------------------

irb(main):008:0> a="a string"
=> "a string"
irb(main):009:0> b="another string"
=> "another string"
irb(main):014:0> s="";[a.size, b.size].max.times {|i| s << ((a[i]||0) ^ (b[i]||0))}
=> 14
irb(main):015:0> s
=> "\000N\034\000\032\f\034Gstring"

Hm....

  robert

···

On 28.01.2007 21:28, Erik Veenstra wrote:

I want to do an XOR of two strings:

"a string".xor("another string")

It's not that hard to implement, but it's not fast either,
since it walks through the data string, byte-by-byte.

Any ideas? For example: "It's memory-hungry!". Any solutions?

gegroet,
Erik V. - http://www.erikveen.dds.nl/

----------------------------------------------------------------

class String
   def xor(other)
     if other.empty?
       self
     else
       a1 = self.unpack("c*")
       a2 = other.unpack("c*")

       a2 *= 2 while a2.length < a1.length

       a1.zip(a2).collect{|c1,c2| c1^c2}.pack("c*")
     end
   end
end

----------------------------------------------------------------

I tried changing your code to work on machine words (on my x86) by using
"I*" instead of "c*". The result wasn't any faster. I also tried

   class String
   def xor(other)
      ret=dup
      ret.length.times{|n| ret[n]^=other[n]}
      ret
   end
   end

once again, the result wasn't any faster. If you want fast, your best bet
is probably to write it in C.

···

On Mon, 29 Jan 2007 05:28:59 +0900, Erik Veenstra wrote:

I want to do an XOR of two strings:

"a string".xor("another string")

It's not that hard to implement, but it's not fast either,
since it walks through the data string, byte-by-byte.

Any ideas? For example: "It's memory-hungry!". Any solutions?

gegroet,
Erik V. - http://www.erikveen.dds.nl/

----------------------------------------------------------------

class String
   def xor(other)
     if other.empty?
       self
     else
       a1 = self.unpack("c*")
       a2 = other.unpack("c*")

       a2 *= 2 while a2.length < a1.length

       a1.zip(a2).collect{|c1,c2| c1^c2}.pack("c*")
     end
   end
end

----------------------------------------------------------------

--
Ken Bloom. PhD candidate. Linguistic Cognition Laboratory.
Department of Computer Science. Illinois Institute of Technology.
http://www.iit.edu/~kbloom1/

You're padding the shortest string with low values. When using
XOR to encrypt a string (my goal...), uh, you're not encrypting
most of the data:

"very long secret data".xor("secret") # ==> "\005\000\021\vE\030ong
secret data"

The secret key should be repeated several times, so every byte
in the data gets XOR'ed.

In a more general situation, your solution (padding low values)
should do. And String#xor should be general, so I'm wrong...

Does "s << a" generate a new string? If so, we have a lot of
intermediate long strings.

gegroet,
Erik V. - http://www.erikveen.dds.nl/

···

----------------------------------------------------------------

class String
   def xor(other, encrypting=false)
     if other.empty?
       self
     else
       a1 = self.unpack("c*")
       a2 = other.unpack("c*")

       if encrypting
         a2 *= 2 while a2.length < a1.length
       else
         l = [a1.length, a2.length].max

         a1.concat([0x00]*(l - a1.length))
         a2.concat([0x00]*(l - a2.length))
       end

       a1.zip(a2).collect{|c1,c2| c1^c2}.pack("c*")
     end
   end
end

p "very long secret data".xor("secret") # ==>
"\005\000\021\vE\030ong secret data"

p "long string".xor("short") # ==> "\037\a
\001\025Tstring"
p "short".xor("long string") # ==> "\037\a
\001\025Tstring"

message = "This is a secret message."
secret = "secret"

p message.xor(secret, false) # ==> "'\r\n
\001E\035s a secret message."
p message.xor(secret, true) # ==> "'\r\n
\001E\035\000E\002R\026\021\020\027\006\006E
\031\026\026\020\023\002\021]"

p message.xor(secret, false).xor(secret, false) # ==> "This
is a secret message."
p message.xor(secret, true).xor(secret, true) # ==> "This
is a secret message."

----------------------------------------------------------------

Or use an existing C extension :slight_smile:
Using NArray:

require 'narray'
require 'benchmark'

class String
   def xor1(other)
     if other.empty?
       self
     else
       a1 = self.unpack("c*")
       a2 = other.unpack("c*")

       a2 *= 2 while a2.length < a1.length

       a1.zip(a2).collect{|c1,c2| c1^c2}.pack("c*")
     end
   end

   def xor2(other)
      if other.empty?
         self
      else
         if other.length < self.length
            div, mod = self.length.divmod(other.length)
            other = other * div + other[0, mod]
         end

         a1 = NArray.to_na(self, "byte")
         a2 = NArray.to_na(other, "byte")

         (a1 ^ a2).to_s
     end
   end
end

$a1 = "abcdefg" * 1000
$a2 = "hijkl" * 1000
Benchmark.bm do |x|
   x.report("xor1:") {$a1.xor1($a2)}
   x.report("xor2:") {$a1.xor2($a2)}
end
       user system total real
xor1: 0.030000 0.000000 0.030000 ( 0.029777)
xor2: 0.000000 0.000000 0.000000 ( 0.000427)

···

On Mon, 29 Jan 2007 15:08:08 +0000, Ken Bloom wrote:

once again, the result wasn't any faster. If you want fast, your best bet
is probably to write it in C.

Erik Veenstra wrote:

You're padding the shortest string with low values. When using
XOR to encrypt a string (my goal...), uh, you're not encrypting
most of the data:

"very long secret data".xor("secret") # ==> "\005\000\021\vE\030ong secret data"

The secret key should be repeated several times, so every byte
in the data gets XOR'ed.

Robert's solution can be modified:

irb(main):008:0> a="a string"
=> "a string"
irb(main):009:0> b="another string"
=> "another string"
irb(main):014:0> s="";[a.size, b.size].max.times {|i| s << ((a[i]||0) ^ (b[i]||0))}
=> 14

This becomes:

s=""
[a.size, b.size].max.times {|i|
   s << ((a[i%a.size]||0)^(b[i%b.size]||0))
}

That takes care of the looping. << doesn't create a new string (at least, not to my knowledge...) but it will (I presume) cause reallocation. This should get around that:

s = " " * [a.size,b.size].max
[a.size, b.size].max.times {|i|
   s[i] = ((a[i%a.size]||0)^(b[i%b.size]||0))
}

No idea if it's faster or slower, and it's past my bedtime so I'm not going to benchmark it right now :slight_smile:

···

--
Alex

Erik Veenstra wrote:

You're padding the shortest string with low values. When using
XOR to encrypt a string (my goal...), uh, you're not encrypting
most of the data:

If you want to encrypt the data, maybe you should use actual
cryptographic functions?
   "a string".crypt("another string")
That's probably more secure than a homegrown xor function

Daniel

THANK YOU!

I added a bit of code to detect the presence of this narray. It
falls back to the Ruby implementation if narray isn't installed.

gegroet,
Erik V. - http://www.erikveen.dds.nl/

···

----------------------------------------------------------------

class String
   def xor_slow(other)
     if other.empty?
       self
     else
       a1 = self.unpack("c*")
       a2 = other.unpack("c*")

       a2 *= a1.length/a2.length + 1

       a1.zip(a2).collect{|c1,c2| c1^c2}.pack("c*")
     end
   end

   def xor_fast(other)
     if other.empty?
       self
     else
       if other.length < self.length
         div, mod = self.length.divmod(other.length)
         other = other * div + other[0, mod]
       end

       a1 = NArray.to_na(self, "byte")
       a2 = NArray.to_na(other, "byte")

       (a1 ^ a2).to_s
     end
   end

   begin
     require "narray"

     alias :xor :xor_fast
   rescue LoadError
     if $VERBOSE
       $stderr.puts "I strongly advise to install the library
'narray'."
     end

     alias :xor :xor_slow
   end
end

----------------------------------------------------------------

I've found a little bug in your version:

long = "this is a very long string"
short = "a short one"

p long.xor(short) # ==> OK
p short.xor(long) # ==> Array size mismatch: 11 != 26

See this fix.

gegroet,
Erik V. - http://www.erikveen.dds.nl/

···

----------------------------------------------------------------

require "narray"

class String
   def xor(other)
     if other.empty?
       self
     else
       s1 = self
       s2 = other

       if s1.length < s2.length
         n, r = s2.length.divmod(s1.length)
         s1 = s1 * n + s1[0, r]
       elsif s2.length < s1.length
         n, r = s1.length.divmod(s2.length)
         s2 = s2 * n + s2[0, r]
       end

       a1 = NArray.to_na(s1, "byte")
       a2 = NArray.to_na(s2, "byte")

       (a1 ^ a2).to_s
     end
   end
end

----------------------------------------------------------------

String#crypt is a hash function, not an encrypt function,
although it's called "encrypt"... ;]

gegroet,
Erik V. - http://www.erikveen.dds.nl/

sure would be great to have narray in the core eh? sigh.

-a

···

On Tue, 30 Jan 2007, Erik Veenstra wrote:

THANK YOU!

I added a bit of code to detect the presence of this narray. It
falls back to the Ruby implementation if narray isn't installed.

gegroet,
Erik V. - http://www.erikveen.dds.nl/

--
we can deny everything, except that we have the possibility of being better.
simply reflect on that.
- the dalai lama