Statistics Module for Ruby

Ruby and Statistics
From time to time I find it useful to do some simple number crunching in Ruby. Here are some of my utility functions.

Warning: This script is provided as is with no warranty implied or otherwise. Users must determine whether it is fit for any particular purpose, as we, the authors, make no claim to that end. Use it at your own risk.

ruby-stats is licensed under the Academic Free License v3.0.

That being said...

ruby-stats


You can download ruby-stats here
#!/usr/bin/ruby
###############

#############################
# Statistics Module for Ruby
# (C) Derrick Pallas
#
# Authors: Derrick Pallas
# Website: http://derrick.pallas.us/ruby-stats/
# License: Academic Free License 3.0
# Version: 2007-10-01b
#

class Numeric
  def square ; self * self ; end
end

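class Array
  # Sum of the elements (0 for an empty array).
  def sum ; self.inject(0){|a,x|x+a} ; end
  # Arithmetic mean as a Float (NaN for an empty array).
  def mean ; self.sum.to_f/self.size ; end
  # Median as a Float; nil for an empty array.
  def median
    case self.size % 2
      when 0 then self.sort[self.size/2-1,2].mean
      when 1 then self.sort[self.size/2].to_f
    end if self.size > 0
  end
  # Hash mapping each distinct value to its number of occurrences.
  def histogram ; self.sort.inject({}){|a,x|a[x]=a[x].to_i+1;a} ; end
  # Array of the most frequent value(s); ties return every tied value.
  def mode
    map = self.histogram
    max = map.values.max
    map.keys.select{|x|map[x]==max}
  end
  # Sum of the squared elements.
  def squares ; self.inject(0){|a,x|x.square+a} ; end
  # Population variance: mean of the squares minus the square of the mean.
  def variance ; self.squares.to_f/self.size - self.mean.square; end
  # Population standard deviation.
  def deviation ; Math::sqrt( self.variance ) ; end
  # Shuffled copy; the receiver is left unchanged.
  def permute ; self.dup.permute! ; end
  # In-place uniform shuffle (Fisher-Yates); returns self.
  def permute!
    (1...self.size).each do |i| ; j=rand(i+1)
      self[i],self[j] = self[j],self[i] if i!=j
    end;self
  end
  # n elements drawn at random with replacement.
  # Note: this replaces the built-in Array#sample of Ruby 1.9 and later.
  def sample n=1 ; (0...n).collect{ self[rand(self.size)] } ; end
end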

if __FILE__ == $0
  fields = []
  $stdin.each do |line|
    data = line.chomp.split("\t")
    data.each_index do |i|
      fields[i] = [] if fields[i].nil?
      fields[i] << data[i].to_f if data[i].size > 0
    end
  end

  fields.each_index do |i|
    next unless fields[i].size > 0
    puts [ i,  fields[i].mean, fields[i].deviation ] \
         .collect{|x|x.to_s}.join("\t")
  end
end

# END
######
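
As a quick sanity check of the formulas in the listing, here is a small standalone sketch. It restates the mean, median, mode and variance definitions as plain methods so that it runs without the module loaded; the helper names and the sample data are illustrative assumptions, not part of ruby-stats. Note that the variance is the population variance (dividing by n, not n-1).

```ruby
# Standalone restatement of the module's formulas (helper names are illustrative).
def mean(xs)
  xs.inject(0) { |a, x| a + x }.to_f / xs.size
end

# Same formula as Array#variance in the listing: mean of squares minus squared mean.
def population_variance(xs)
  xs.inject(0) { |a, x| a + x * x }.to_f / xs.size - mean(xs)**2
end

# Middle value, or the mean of the two middle values; nil for an empty array.
def median(xs)
  return nil if xs.empty?
  sorted = xs.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid].to_f : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Every value that occurs the maximum number of times.
def modes(xs)
  counts = Hash.new(0)
  xs.each { |x| counts[x] += 1 }
  max = counts.values.max
  counts.keys.select { |k| counts[k] == max }
end

data = [2, 4, 4, 4, 5, 5, 7, 9]
puts mean(data)                            # 5.0
puts population_variance(data)             # 4.0
puts Math.sqrt(population_variance(data))  # 2.0
puts median(data)                          # 4.5
p modes(data)                              # [4]
```

With the module loaded, the corresponding calls would be data.mean, data.variance, data.deviation, data.median and data.mode.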

An example.


Based on example 18-6 from Resampling: The New Statistics: given the weights of pigs in groups fed different diets, what is the likelihood that the differences among the groups' mean weights are due to chance?
  #!/usr/bin/ruby
  require './stats'  # loads stats.rb (the module above) from the current directory
  A=%w{34 29 26 32 35 38 31 34 30 29 32 31}.collect{|x|x.to_i}
  B=%w{26 24 28 29 30 29 32 26 31 29 32 28}.collect{|x|x.to_i}
  C=%w{30 30 32 31 29 27 25 30 31 32 34 33}.collect{|x|x.to_i}
  D=%w{32 25 31 26 32 27 28 29 29 28 23 25}.collect{|x|x.to_i}

  U = A + B + C + D
  e = [ A.mean, B.mean, C.mean, D.mean ].variance
  n = 4000
  t = \
  (0...n).inject(0) do |a,x|
    ([ U.permute![0...A.size].mean,
       U.permute![0...B.size].mean,
       U.permute![0...C.size].mean,
       U.permute![0...D.size].mean,
    ].variance>=e) ? a+1 : a
  end
  puts t.to_f / n
  #
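
One thing to note about the script above: it reshuffles U before drawing each group, so the four resampled groups are drawn independently and may share pigs. A stricter permutation test shuffles the pool once per trial and deals it into disjoint groups. The following is a minimal sketch of that variant, not the method used above. The helper names (spread_of_means, permutation_p_value) and the seed are illustrative assumptions, and it uses Ruby's built-in Array#shuffle (with the random: option, Ruby 1.9.3 or later) in place of permute! so that it runs on its own.

```ruby
# Pig weights from the example above, one array per feed group.
groups = [
  [34, 29, 26, 32, 35, 38, 31, 34, 30, 29, 32, 31],
  [26, 24, 28, 29, 30, 29, 32, 26, 31, 29, 32, 28],
  [30, 30, 32, 31, 29, 27, 25, 30, 31, 32, 34, 33],
  [32, 25, 31, 26, 32, 27, 28, 29, 29, 28, 23, 25]
]

def mean(xs)
  xs.inject(0) { |a, x| a + x }.to_f / xs.size
end

# Test statistic: population variance of the group means.
def spread_of_means(gs)
  means = gs.map { |g| mean(g) }
  m = mean(means)
  mean(means.map { |x| (x - m)**2 })
end

# Fraction of trials in which a random regrouping of the pooled data
# spreads the group means at least as much as the observed grouping.
def permutation_p_value(groups, trials, rng)
  observed = spread_of_means(groups)
  pool = groups.flatten
  sizes = groups.map { |g| g.size }
  hits = 0
  trials.times do
    shuffled = pool.shuffle(random: rng)  # one shuffle per trial
    start = 0
    resampled = sizes.map do |size|       # deal into disjoint groups
      slice = shuffled[start, size]
      start += size
      slice
    end
    hits += 1 if spread_of_means(resampled) >= observed
  end
  hits.to_f / trials
end

p_value = permutation_p_value(groups, 4000, Random.new(42))
puts p_value
```

Passing a seeded Random makes a run repeatable; drop the seed (Random.new) to get a fresh estimate each time.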
  

 
Anyone attempting to generate random numbers by deterministic means is, of course, living in a state of sin.

 
   
org DOT telperion AT pallas