References:
A Comprehensive Comparative Study on Term Weighting Schemes for Text Categorization with SVM
Proposing a New Term Weighting Scheme for Text Categorization
Supervised and Traditional Term Weighting Methods for Automatic Text Categorization
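#!/usr/bin/ruby -Kuw
# Written for Ruby 1.9 (2009-03-04)
# Input format (comma-separated, one document per line):
#   category,total_terms,t1,t2,t3
f = ARGV[0]

# Read every row into m; the column count is taken from the first row.
m = []
rows = 0
cols = 0
File.open(f) do |file|
  while data = file.gets
    cols = data.split(',').size if cols == 0
    m << data.split(',')
    rows += 1
  end
end

# rfa[j-2] holds a^2 / c for term column j, where
#   a = number of category-1 documents that contain the term
#   c = number of other documents that contain the term (treated as 1 if 0)
rfa = []
for j in 2..cols-1
  a = c = 0
  for i in 0..rows-1
    a += 1 if m[i][0].to_i == 1 and m[i][j].to_i > 0
    c += 1 if m[i][0].to_i != 1 and m[i][j].to_i > 0
  end
  c = 1 if c == 0
  rfa << (a.to_f**2) / c.to_f
end

# Print each row as its category followed by tf * rf for every term, where
# tf = term count / total terms and rf = log2(2 + a^2 / c).
tmp = []
for i in 0..rows-1
  tmp << m[i][0]
  for j in 2..cols-1
    tf = m[i][j].to_f / m[i][1].to_f
    rf = Math.log2(2 + rfa[j-2])
    tmp << sprintf("%2.4f", tf * rf)
  end
  puts tmp.join(',')
  tmp = []
end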
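To make the weighting easier to check by hand, here is a minimal sketch of the same calculation on a tiny in-memory data set. The helper name `tf_rf_rows`, its `positive` parameter, and the sample rows are all made up for illustration and do not come from the original script. It assumes the same input layout (category, total term count, then one count per term) and uses the same ratio as the script, a^2 / c.

```ruby
# Hypothetical helper (not part of the original script): returns, for each
# row, the category followed by tf * log2(2 + a^2 / c) for every term column.
def tf_rf_rows(rows, positive = 1)
  term_cols = 2...rows.first.size

  # One ratio per term column, computed over all rows.
  ratios = term_cols.map do |j|
    a = rows.count { |r| r[0].to_i == positive && r[j].to_i > 0 }
    c = rows.count { |r| r[0].to_i != positive && r[j].to_i > 0 }
    c = 1 if c == 0 # same guard as the script: avoid dividing by zero
    (a.to_f**2) / c
  end

  rows.map do |r|
    weights = term_cols.each_with_index.map do |j, k|
      tf = r[j].to_f / r[1].to_f
      format("%2.4f", tf * Math.log2(2 + ratios[k]))
    end
    [r[0]] + weights
  end
end

# Made-up sample data: category,total_terms,t1,t2
sample = [
  %w[1 10 2 0],
  %w[1 5 1 1],
  %w[0 8 0 4],
]

tf_rf_rows(sample).each { |row| puts row.join(',') }
# 1,0.5170,0.0000
# 1,0.5170,0.3170
# 0,0.0000,0.7925
```

For t1, two category-1 documents contain the term and no other document does, so a = 2, c is raised to 1, and rf = log2(2 + 4) ≈ 2.585. For t2, a = 1 and c = 1, so rf = log2(3) ≈ 1.585.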