tf–idf - term frequency–inverse document frequency, http://blog.scriptingsysadmin.com/post... algorithms, statistic