Ruby and Statistics
From time to time I find it useful to do some simple number crunching in Ruby.
Here are some of my utility functions.
Warning:
This script is provided as is with no warranty implied or otherwise. Users must
determine whether it is fit for any particular purpose, as we, the authors, make
no claim to that end. Use it at your own risk.
ruby-stats is licensed under the
Academic Free License v3.0.
That being said...
ruby-stats
You can download ruby-stats here
#!/usr/bin/ruby
############################################
# Statistics Module for Ruby
# (C) Derrick Pallas
#
# Authors: Derrick Pallas
# Website: http://derrick.pallas.us/ruby-stats/
# License: Academic Free License 3.0
# Version: 2007-10-01b
#
class Numeric
  def square ; self * self ; end
end

class Array
  def sum ; self.inject(0){|a,x|x+a} ; end
  def mean ; self.sum.to_f/self.size ; end
  # Median as a Float; nil for an empty array.
  def median
    case self.size % 2
      when 0 then self.sort[self.size/2-1,2].mean
      when 1 then self.sort[self.size/2].to_f
    end if self.size > 0
  end
  # Hash mapping each distinct value to its count.
  def histogram ; self.sort.inject({}){|a,x|a[x]=a[x].to_i+1;a} ; end
  # Array of all values tied for the highest count.
  def mode
    map = self.histogram
    max = map.values.max
    map.keys.select{|x|map[x]==max}
  end
  # Sum of the squared elements.
  def squares ; self.inject(0){|a,x|x.square+a} ; end
  # Population variance: mean of the squares minus the square of the mean.
  def variance ; self.squares.to_f/self.size - self.mean.square; end
  def deviation ; Math::sqrt( self.variance ) ; end
  # permute returns a shuffled copy; permute! shuffles in place.
  def permute ; self.dup.permute! ; end
  def permute!
    (1...self.size).each do |i| ; j=rand(i+1)
      self[i],self[j] = self[j],self[i] if i!=j
    end;self
  end
  # n elements drawn at random, with replacement.
  def sample n=1 ; (0...n).collect{ self[rand(self.size)] } ; end
end
# When run as a script: read tab-separated numbers from standard input and
# print each column's index, mean, and standard deviation.
if __FILE__ == $0
  fields = []
  $stdin.each do |line|
    data = line.chomp.split("\t")
    data.each_index do |i|
      fields[i] = [] if fields[i].nil?
      fields[i] << data[i].to_f if data[i].size > 0
    end
  end
  fields.each_index do |i|
    next unless fields[i].size > 0
    puts [ i, fields[i].mean, fields[i].deviation ] \
      .collect{|x|x.to_s}.join("\t")
  end
end
# END
######
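To get a feel for what the methods return, here is a small self-contained sketch. So that it runs without the file above, it copies simplified versions of a few methods into a stand-alone module instead of reopening Array; the module name StatsSketch and the sample data are assumptions made for illustration and are not part of ruby-stats. Note that variance, as written above, is the population variance (mean of the squares minus the square of the mean), not the n-1 sample variance.

```ruby
# Hypothetical stand-alone copies of a few ruby-stats methods.
# StatsSketch is an illustrative name, not part of ruby-stats.
module StatsSketch
  module_function

  def mean(xs)
    xs.inject(0) { |a, x| a + x }.to_f / xs.size
  end

  # Middle value of the sorted data, or the average of the two middle
  # values when the size is even. Returns nil for an empty array.
  def median(xs)
    return nil if xs.empty?
    s = xs.sort
    mid = s.size / 2
    s.size.odd? ? s[mid].to_f : mean(s[mid - 1, 2])
  end

  # All values tied for the highest count.
  def mode(xs)
    counts = xs.inject(Hash.new(0)) { |h, x| h[x] += 1; h }
    max = counts.values.max
    counts.keys.select { |k| counts[k] == max }
  end

  # Population variance: mean of squares minus square of the mean.
  def variance(xs)
    xs.inject(0) { |a, x| a + x * x }.to_f / xs.size - mean(xs)**2
  end

  def deviation(xs)
    Math.sqrt(variance(xs))
  end
end

data = [2, 4, 4, 4, 5, 5, 7, 9]
puts StatsSketch.mean(data)          # 5.0
puts StatsSketch.median(data)        # 4.5
puts StatsSketch.mode(data).inspect  # [4]
puts StatsSketch.variance(data)      # 4.0
puts StatsSketch.deviation(data)     # 2.0
```

mode returns an array because several values can tie for the highest count.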
An example.
Based on example 18-6 from Resampling: The New Statistics:
given the weights of pigs in groups fed different rations, what is the
likelihood that the differences among the groups' mean weights are due to chance?
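#!/usr/bin/ruby
# Assumes the module above is saved as stats.rb next to this script.
require_relative 'stats'

# Pig weights for the four feed groups.
A=%w{34 29 26 32 35 38 31 34 30 29 32 31}.collect{|x|x.to_i}
B=%w{26 24 28 29 30 29 32 26 31 29 32 28}.collect{|x|x.to_i}
C=%w{30 30 32 31 29 27 25 30 31 32 34 33}.collect{|x|x.to_i}
D=%w{32 25 31 26 32 27 28 29 29 28 23 25}.collect{|x|x.to_i}
U = A + B + C + D
# Observed spread: variance of the four group means.
e = [ A.mean, B.mean, C.mean, D.mean ].variance
n = 4000
# Count the trials whose reshuffled group means vary at least as much.
t = \
(0...n).inject(0) do |a,x|
  ([ U.permute![0...A.size].mean,
     U.permute![0...B.size].mean,
     U.permute![0...C.size].mean,
     U.permute![0...D.size].mean,
   ].variance>=e) ? a+1 : a
end
# Fraction of trials at least as extreme as the observed spread.
puts t.to_f / n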
#
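The loop above reshuffles U before drawing each group, so the same weight can land in more than one resampled group within a trial. A stricter permutation test would shuffle the pooled weights once per trial and deal them into four disjoint groups. Below is a minimal sketch of that variant, not the method used above or in the book. It assumes Ruby 1.9.3 or later for the built-in Array#shuffle with its random: option; the helper names mean, variance and permutation_p_value and the seed 42 are hypothetical, chosen only for illustration.

```ruby
# Hypothetical helpers for illustration; not part of ruby-stats.
def mean(xs)
  xs.inject(0) { |a, x| a + x }.to_f / xs.size
end

# Population variance, computed around the mean.
def variance(xs)
  m = mean(xs)
  xs.inject(0) { |a, x| a + (x - m)**2 } / xs.size
end

# Fraction of trials in which the variance of the shuffled group means
# is at least as large as the observed variance of the group means.
def permutation_p_value(groups, trials, rng = Random.new)
  observed = variance(groups.map { |g| mean(g) })
  pooled = groups.flatten
  sizes = groups.map(&:size)
  hits = 0
  trials.times do
    # One shuffle per trial, then deal into disjoint groups.
    shuffled = pooled.shuffle(random: rng)
    offset = 0
    means = sizes.map do |n|
      m = mean(shuffled[offset, n])
      offset += n
      m
    end
    hits += 1 if variance(means) >= observed
  end
  hits.to_f / trials
end

# Pig weights copied from the example above.
a = [34, 29, 26, 32, 35, 38, 31, 34, 30, 29, 32, 31]
b = [26, 24, 28, 29, 30, 29, 32, 26, 31, 29, 32, 28]
c = [30, 30, 32, 31, 29, 27, 25, 30, 31, 32, 34, 33]
d = [32, 25, 31, 26, 32, 27, 28, 29, 29, 28, 23, 25]

# Seeded only so that repeated runs give the same estimate.
p_value = permutation_p_value([a, b, c, d], 4000, Random.new(42))
puts p_value
```

Because each trial is a true partition of the 48 weights, every resampled data set is a relabelling of the original one. The estimate will generally differ somewhat from the one produced by the script above.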