
Visualizing Word Frequencies with Wordle

28 Jun

Jonathan Feinberg created “Wordle”, a nice little app for generating and editing word clouds. From the Wordle website:

Wordle is a toy for generating “word clouds” from text that you provide. The clouds give greater prominence to words that appear more frequently in the source text. You can tweak your clouds with different fonts, layouts, and color schemes. The images you create with Wordle are yours to use however you like. You can print them out, or save them to the Wordle gallery to share with your friends.

Here is a sample word cloud generated from a previous Visualign Blog post (Interactive and Visual Information):

Wordle generated word cloud of a previous Visualign post.

By default, common English words (“the”, “is”, “and”, etc.) are stripped out so that the focus falls on substantive content words. One can also exclude individual words – such as the dominant word “information” above – and tweak many other options. If one could create similar word clouds from recorded speech, they might be used to visualize certain speech patterns and perhaps cure bad habits (such as repeating “Ummm” or other filler words).
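The counting step behind such a cloud can be sketched in a few lines of Python. This is only an illustration of the idea, not Wordle's actual implementation: the tiny stop-word list, the function names `word_frequencies` and `font_sizes`, and the linear mapping from counts to font sizes are all assumptions made for this example.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list (an assumption; a real list is much longer).
STOP_WORDS = {"the", "is", "and", "a", "an", "of", "to", "in", "it", "that"}

def word_frequencies(text, exclude=()):
    """Count words in text, skipping stop words and explicitly excluded words."""
    words = re.findall(r"[a-z']+", text.lower())
    skip = STOP_WORDS | {w.lower() for w in exclude}
    return Counter(w for w in words if w not in skip)

def font_sizes(freqs, min_size=10, max_size=72):
    """Map counts linearly onto a font-size range (a hypothetical scaling choice)."""
    if not freqs:
        return {}
    lo, hi = min(freqs.values()), max(freqs.values())
    span = (hi - lo) or 1  # avoid division by zero when all counts are equal
    return {
        word: min_size + (count - lo) * (max_size - min_size) / span
        for word, count in freqs.items()
    }

text = ("Information is visual and the information is interactive; "
        "visual information matters.")

freqs = word_frequencies(text)            # stop words removed
sizes = font_sizes(freqs)                 # most frequent word gets max_size
filtered = word_frequencies(text, exclude=["information"])  # drop a dominant word
```

Excluding a dominant word, as in the last line, lets the remaining words spread over the full size range once `font_sizes` is applied again. For the speech idea, the same counting would work on a transcript, with filler words deliberately kept out of the stop-word list.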

Here is another sample screen shot of the Java applet after creating the word cloud from James Taylor’s RSS feed on Enterprise Decision Management:

Wordle Java applet with word cloud. Note the prominence of PMML (Predictive Model Markup Language).

While it’s not clear how to measure the impact or value of such word cloud visualizations, they do provide a novel way to use colors, frequencies, font sizes, etc. to filter, highlight, and elucidate structure in textual data – something very close to Visualign’s philosophy.

 

Posted on June 28, 2011 in Linguistic

 
