ISDS Test 4 Clicker Questions
What is TopCat?
A system developed by MITRE that identifies different topics in a collection of documents and displays the key "players" for each topic
What is a summarization filter?
It identifies and aggregates descriptions of people from a collection of documents by means of an efficient syntactic analysis, the use of a thesaurus, and some simple natural language processing techniques
How do you think text mining techniques could be used in other businesses?
Any business that deals with individual customers during the sales process, or that has a wealth of data in its written customer communications
What were HP's challenges in text mining? How were they overcome?
Combining data from structured databases with unstructured data from text; customizing the software package's vocabulary for the words used at HP; bringing in text data that is not in a traditional data warehouse; using the results to find insights into customers
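As a rough illustration of the first two challenges, the sketch below maps product-specific words to canonical terms through a custom vocabulary and then joins the extracted terms to structured customer records. The vocabulary, customer table, and e-mails are all hypothetical examples, not HP's actual data or software.

```python
import re
from collections import Counter

# Hypothetical custom vocabulary: maps company-specific words to canonical terms.
CUSTOM_VOCAB = {
    "laserjet": "printer",
    "deskjet": "printer",
    "toner": "cartridge",
    "ink": "cartridge",
}

# Hypothetical structured records, as they might sit in a data warehouse.
customers = {
    101: {"segment": "small business"},
    102: {"segment": "consumer"},
}

# Hypothetical unstructured text: (customer_id, e-mail body).
emails = [
    (101, "My LaserJet keeps jamming and the toner runs out fast."),
    (102, "The ink for my DeskJet is too expensive."),
    (101, "Second LaserJet issue this month."),
]

def extract_terms(text):
    """Count canonical terms for every vocabulary word found in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(CUSTOM_VOCAB[t] for t in tokens if t in CUSTOM_VOCAB)

def terms_by_segment(customers, emails):
    """Join text-derived term counts to the structured customer segment."""
    result = {}
    for customer_id, body in emails:
        segment = customers[customer_id]["segment"]
        result.setdefault(segment, Counter()).update(extract_terms(body))
    return result

summary = terms_by_segment(customers, emails)
```

Here `summary` groups the term counts by customer segment, which is one simple way of turning text into something that can be analyzed alongside structured data.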
What is a practical application of text mining?
Extracting useful info from the thousands of customer communications (such as e-mails) that HP receives
What is the Genoa Project?
Part of DARPA's Total Information Awareness program; provides advanced tools and techniques to rapidly analyze info related to a current situation to support decision making; provides knowledge discovery tools to better "mine" relevant info sources for discovery
Comment on the future of text mining tools for counterterrorism:
Terrorists plan their activities via the internet ("chatter"); chatter is continuously monitored and extracted for analysis; text mining can analyze and filter huge volumes of data and identify plans before they are carried out
What is the motivation behind projects like Genoa?
To capture and filter info quickly and easily
What does TopCat do?
Uses association rule mining technology to identify relationships among people, organizations, locations, and events; groups these relationships and creates topic clusters, which are built from 6 months of global news
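To make the idea of association rule mining over entities concrete, here is a minimal sketch that computes support and confidence for entity pairs co-occurring in news articles. It is not TopCat's actual implementation; the articles, entity names, and the `min_support` threshold are hypothetical, and entity extraction is assumed to have been done already.

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: the named entities already extracted from each article.
articles = [
    {"Acme Corp", "Jane Doe", "Paris"},
    {"Acme Corp", "Jane Doe", "Berlin"},
    {"Acme Corp", "Jane Doe", "Paris"},
    {"Globex", "John Roe", "Paris"},
]

def frequent_pairs(docs, min_support=0.5):
    """Return entity pairs that co-occur in at least min_support of the docs."""
    pair_counts = Counter()
    for entities in docs:
        # Sort so each pair has one canonical ordering.
        for pair in combinations(sorted(entities), 2):
            pair_counts[pair] += 1
    n = len(docs)
    return {p: c / n for p, c in pair_counts.items() if c / n >= min_support}

def confidence(docs, antecedent, consequent):
    """Fraction of docs mentioning antecedent that also mention consequent."""
    with_antecedent = [d for d in docs if antecedent in d]
    if not with_antecedent:
        return 0.0
    return sum(1 for d in with_antecedent if consequent in d) / len(with_antecedent)

pairs = frequent_pairs(articles)
rule_confidence = confidence(articles, "Acme Corp", "Jane Doe")
```

Pairs with high support and confidence are candidate relationships; grouping pairs that share entities would be one simple route toward the topic clusters described above.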