Greeting Text Analysis

This is a late post for week 3’s blog. I didn’t know what to write last week because I wasn’t sure about the homework topics, but now, here it goes.

What is Text Analysis?

In my own words, text analysis is a process that extracts the most important information from a text. It is also called “text mining”, which refers to “deriving high-quality information from text”.

Wikipedia says (I know it’s not a scholarly resource, but let’s hear what it says), and I quote, “The term text analytics describes a set of linguistic, statistical, and machine learning techniques that model and structure the information content of textual sources for business intelligence, exploratory data analysis, research, or investigation.”

Basic Text Summaries and Analyses

  1. Word frequency (lists of words and their frequencies) (See also: Word counts are amazing, Ted Underwood)
  2. Collocation (words commonly appearing near each other)
  3. Concordance (the contexts of a given word or set of words)
  4. N-grams (common two-, three-, etc.- word phrases)
  5. Entity recognition (identifying names, places, time periods, etc.)
  6. Dictionary tagging (locating a specific set of words in the texts)
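To make a few of these summaries more concrete, here is a minimal sketch of word frequency, n-grams, and concordance using only the Python standard library. The function names, the simple regex tokenizer, and the sample sentence are my own assumptions for illustration; real projects usually rely on a dedicated text analysis library and a more careful tokenizer.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(tokens):
    """Word frequency: count how often each word occurs."""
    return Counter(tokens)


def ngrams(tokens, n):
    """N-grams: count every run of n consecutive tokens."""
    return Counter(zip(*(tokens[i:] for i in range(n))))


def concordance(tokens, keyword, window=2):
    """Concordance: each occurrence of keyword with `window` words on either side."""
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            lines.append(" ".join(left + [token] + right))
    return lines


# Hypothetical sample text, made up for this example.
sample = "The cat sat on the mat. The cat saw the dog."
tokens = tokenize(sample)

print(word_frequencies(tokens).most_common(2))  # [('the', 4), ('cat', 2)]
print(ngrams(tokens, 2).most_common(1))         # [(('the', 'cat'), 2)]
print(concordance(tokens, "cat"))               # ['the cat sat on', 'mat the cat saw the']
```

Even on a two-sentence sample, the counts show why very common words like “the” are often filtered out before looking at the results.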

High-level Goals for Text Analysis

(From Underwood, T. (2012). Where to start with text mining.)

  1. Document categorization:
     a. Information retrieval (e.g., search engines)
     b. Supervised classification (e.g., guessing genres)
     c. Unsupervised clustering (e.g., alternative “genres”)
  2. Corpora comparison (e.g., political speeches)
  3. Language use over time (e.g., Google ngram viewer)
  4. Detecting clusters of document features (i.e., topic modeling)
  5. Entity recognition/extraction (e.g., geoparsing)
  6. Visualization

I think the in-class experience helped me a lot in understanding what “text analysis” is and how to do it (by comparing the two poems and trying to figure out what the author of the article was trying to tell us). I was confused by the term at first, but now I feel like I’m starting to get used to it, and it has somehow become fun for me.

Reference

LibGuides: Introduction to Text Analysis: About Text Analysis. (n.d.). Retrieved September 16, 2016, from http://guides.library.duke.edu/c.php?g=289707

Written on September 16, 2016 by Meiqi Zhao