database - How to solve this with k-means clustering and cosine similarity


Can anyone tell me how k-means clustering works in text mining, using cosine similarity as the distance metric?

    nim       310910022  320910044  310910043  310910021
    access            0          2          3          5
    abdi              1          0          0          0
    actual            5          0          0          1
    arrow             0          1          1          2

This data is displayed in a ListView.

How can I do this in VB.NET? I want to cluster the terms to find trending topics.

Thanks in advance.

First, separate the problem into two parts:

  1. Computing the k-means clustering assignments
  2. Getting the data into the GUI (you mention the data is in a ListView)

I assume 2 is straightforward and that you need help with 1.

I would start by rewriting the code to read the data you specified from a TSV (tab-separated) text file. That makes things a lot easier to debug.
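As a rough sketch of that first step (in Python, since that is what the library example in this answer uses), the table from the question could be parsed like this. The function name `read_term_matrix` and the inline `SAMPLE_TSV` string are my own illustrative choices, and I am assuming the first column holds the term and the remaining columns hold counts.

```python
import csv
import io

# Hypothetical sample: the table from the question, saved as tab-separated text.
SAMPLE_TSV = (
    "nim\t310910022\t320910044\t310910043\t310910021\n"
    "access\t0\t2\t3\t5\n"
    "abdi\t1\t0\t0\t0\n"
    "actual\t5\t0\t0\t1\n"
    "arrow\t0\t1\t1\t2\n"
)

def read_term_matrix(handle):
    """Return (column_ids, terms, rows) from a tab-separated term-count table."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    column_ids = header[1:]  # the first header cell only labels the term column
    terms, rows = [], []
    for record in reader:
        if not record:  # skip blank lines
            continue
        terms.append(record[0])
        rows.append([float(cell) for cell in record[1:]])
    return column_ids, terms, rows

column_ids, terms, rows = read_term_matrix(io.StringIO(SAMPLE_TSV))
```

For a real file you would pass an open file object instead of the `StringIO` wrapper, for example `open("terms.tsv", newline="")`, where the file name is only a placeholder.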

Then ask yourself whether you want to implement the k-means algorithm yourself or use a library. If you want to implement it, here is a link to pseudocode: http://www.scribd.com/doc/89373376/k-means-pseudocode. You can also search for other k-means pseudocode.
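If you go the implement-it-yourself route, the following is a minimal sketch of k-means that uses cosine similarity instead of Euclidean distance (often called spherical k-means). It is not taken from the linked pseudocode; the function names, the `iterations` and `seed` parameters, and the choice to pick initial centroids at random from the data are all assumptions of mine. Each vector is scaled to unit length first, so the dot product of two vectors equals their cosine similarity.

```python
import math
import random

def normalize(vec):
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)

def cosine_similarity(a, b):
    """Dot product; equals the cosine when both inputs are unit length."""
    return sum(x * y for x, y in zip(a, b))

def kmeans_cosine(vectors, k, iterations=100, seed=None):
    """Cluster vectors by cosine similarity; returns (labels, centroids)."""
    rng = random.Random(seed)
    points = [normalize(v) for v in vectors]
    centroids = rng.sample(points, k)  # random initial centroids from the data
    labels = None
    for _ in range(iterations):
        # Assignment step: each point goes to its most similar centroid.
        new_labels = [
            max(range(k), key=lambda c: cosine_similarity(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # nothing moved, so we have converged
            break
        labels = new_labels
        # Update step: each centroid becomes the renormalized mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                mean = [sum(col) / len(members) for col in zip(*members)]
                centroids[c] = normalize(mean)
    return labels, centroids

# Rows from the question: access, abdi, actual, arrow.
rows = [[0, 2, 3, 5], [1, 0, 0, 0], [5, 0, 0, 1], [0, 1, 1, 2]]
labels, centroids = kmeans_cosine(rows, k=2, seed=0)
```

This clusters the rows (terms). To cluster the columns instead, transpose first with `list(zip(*rows))`. The value `k=2` is only a guess for this tiny table; choosing `k` is a separate question.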

If you want to use a library to "run" your data against the k-means algorithm, here is one example in Python/SciPy: http://glowingpython.blogspot.com/2012/04/k-means-clustering-with-scipy.html
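For the library route, here is one possible call using `scipy.cluster.vq.kmeans2`; I have not checked whether the linked article uses the same function. SciPy's k-means works with Euclidean distance, so this sketch scales every row to unit length first, which makes the nearest centroid by Euclidean distance a reasonable stand-in for the most similar one by cosine. Treat it as an approximation under that assumption, not as true cosine k-means, and note that `k=2` and `seed=0` are arbitrary choices of mine (the `seed` argument needs a reasonably recent SciPy).

```python
import numpy as np
from scipy.cluster.vq import kmeans2

# Rows from the question: access, abdi, actual, arrow.
rows = np.array([
    [0, 2, 3, 5],
    [1, 0, 0, 0],
    [5, 0, 0, 1],
    [0, 1, 1, 2],
], dtype=float)

# Scale each row to unit length so Euclidean distance tracks cosine similarity.
# (This assumes no row is all zeros; such rows would need special handling.)
unit_rows = rows / np.linalg.norm(rows, axis=1, keepdims=True)

# minit="points" picks the initial centroids from the data itself.
centroids, labels = kmeans2(unit_rows, 2, minit="points", seed=0)
```

`labels[i]` is the cluster index assigned to row `i`, and `centroids` holds one row per cluster.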

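Regardless of the approach you use, be aware that k-means is non-deterministic when the initial centroids are chosen at random, so you may get different answers every time you run it. I recommend running it against a known validation set to check that the results are what you think they should be.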
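One way to do that comparison, sketched below, is the Rand index: the fraction of item pairs on which two labelings agree about "same cluster" versus "different cluster". It ignores the cluster numbers themselves, which matters because k-means may number the same grouping differently on each run. The example labelings are made up for illustration.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of item pairs that both labelings treat the same way."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)

expected = [0, 0, 1, 1]  # hypothetical hand-labeled validation set
run_1 = [1, 1, 0, 0]     # same grouping, cluster numbers swapped
run_2 = [0, 1, 1, 1]     # a different grouping

print(rand_index(expected, run_1))  # 1.0
print(rand_index(expected, run_2))  # 0.5
```

A score of 1.0 means the run reproduced the expected grouping exactly. Running the clustering with several different seeds and comparing each result this way shows how stable the answer is.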

