
References:

A Comprehensive Comparative Study on Term Weighting Schemes for Text Categorization with SVM

Proposing a New Term Weighting Scheme for Text Categorization

Supervised and Traditional Term Weighting Methods for Automatic Text Categorization

#!/usr/bin/ruby -Kuw

# Ruby 1.9
# Input format (one document per line):
# category,total word count,t1,t2,t3
# 20090304

f=ARGV[0]

m=[] # parsed input rows
rows=0
cols=0
File.open(f) do |file|
  while data=file.gets
    # strip the newline; limit -1 keeps trailing empty fields
    fields=data.chomp.split(',', -1)
    cols=fields.size if cols==0
    m << fields
    rows += 1
  end
end

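rfa=[] # rf ratio for each term column
for j in 2..cols-1
  # a: category-1 documents containing term j
  # c: documents of other categories containing term j
  a=c=0
  for i in 0..rows-1
    a+=1 if m[i][0].to_i == 1 and m[i][j].to_i > 0
    c+=1 if m[i][0].to_i != 1 and m[i][j].to_i > 0
  end
  c=1 if c==0 # avoid division by zero

  # Note: this script uses a**2/c; the rf in the referenced papers is a/max(1,c).
  rfa << (a.to_f**2)/c.to_f
end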

for i in 0..rows-1
  tmp=[m[i][0]] # each output row starts with the category label
  for j in 2..cols-1
    tf=m[i][j].to_f/m[i][1].to_f # term count normalized by document length
    rf=Math.log2(2+rfa[j-2])
    tmp << sprintf("%2.4f",tf*rf)
  end
  puts tmp.join(',')
end
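
The weighting done by the script can be checked by hand on a single term. The sketch below is only an illustration: `rf_weight` and `tf_rf` are hypothetical helper names that do not appear in the script, and the guard for an empty document is an added assumption. It uses the same squared ratio and the same zero guard for c as the script.

```ruby
# Hypothetical helper (not in the original script): relevance frequency
# for one term, using the same squared ratio and zero guard as the script.
# a: category-1 documents containing the term
# c: documents of other categories containing the term
def rf_weight(a, c)
  c = 1 if c == 0
  Math.log2(2 + (a.to_f**2) / c)
end

# Hypothetical helper: tf.rf weight of one term in one document.
# Returns 0.0 for an empty document instead of dividing by zero
# (an added guard; the script has no such check).
def tf_rf(count, total_words, a, c)
  return 0.0 if total_words.to_f.zero?
  (count.to_f / total_words.to_f) * rf_weight(a, c)
end

# Sample call: the term occurs 2 times in a 10-word document and is
# found in 2 category-1 documents and 2 other documents.
puts format("%2.4f", tf_rf(2, 10, 2, 2))
```

For this sample call, tf = 2/10 = 0.2 and rf = log2(2 + 2**2/2) = log2(4) = 2, so the printed weight is 0.4000.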




Posted by igogo on PIXNET (痞客邦)