Title: CATPAC
CATPAC and WordStat
- Anne D. Sito 
 - Erin Sonenstein 
 - COM 633 FA 09
 
CATPAC
Overview of CATPAC
- Designed to recognize frequently used words in text
- Identifies and groups patterns of similar words
- Provides output of clustering algorithms, perceptual maps, and interactive clustering
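The first point, recognizing frequently used words, can be illustrated with a minimal Python sketch. This is not CATPAC's own code; the function name `word_frequencies` and the sample sentence are made up for illustration, and the tokenizing rule (lowercase letters plus apostrophes) is an assumption.

```python
import re
from collections import Counter

def word_frequencies(text, top_n=5):
    """Return the top_n most frequent lowercase words in text."""
    # Keep apostrophes so contractions such as "you'll" stay one token.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(top_n)

# Made-up sample text, not taken from the analyzed document.
sample = "You will love it. You'll see that you can learn to relax."
print(word_frequencies(sample, top_n=3))
```

Note that a plain count like this treats "you" and "you'll" as different words and does nothing to filter out common function words.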
Data Preparation Text
1. Convert document into .txt file
2. Inputting Data
3. Select Text File You Want to Analyze
4. Select Make Dendrogram
5. Initial Output Screen
6. Output Data Screen
7. Output Dendrogram
8. Data Presented in ThoughtView 2D
9. Data Presented in ThoughtView 3D
10. ThoughtView 3D (Rotated)
Discussion and Limitations
Pros:
- Found words like you, you'll, and to be the most used in this text.
- Examines relationships between words based on proximity in the text.
Cons:
- Words are measured based on frequency, not importance.
- Focuses less on what words mean or how they fit together based on dictionaries.
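To make "relationships between words based on proximity" concrete, here is a rough Python sketch that counts how often two words fall inside the same sliding window. It illustrates the general idea only and is not CATPAC's actual algorithm; the function name `window_cooccurrence`, the window size of 3, and the sample words are all assumptions.

```python
from collections import Counter
from itertools import combinations

def window_cooccurrence(words, window=3):
    """Count word pairs that appear together inside a sliding window."""
    counts = Counter()
    # Slide the window one word at a time across the text.
    for start in range(len(words) - window + 1):
        # Use a set so a word repeated inside one window counts once.
        unique = sorted(set(words[start:start + window]))
        for a, b in combinations(unique, 2):
            counts[(a, b)] += 1
    return counts

# Made-up word sequence for illustration.
words = "stress makes sleep hard sleep helps stress".split()
pairs = window_cooccurrence(words, window=3)
print(pairs.most_common(3))
```

Pairs are scored purely by how often they share a window, which mirrors the limitation above: frequency and proximity are measured, not meaning.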
WordStat
http://www.provalisresearch.com/wordstat/wordstat.html
Overview of WordStat
- Content Analysis Module for SIMSTAT
- Specifically designed to process textual information; geared for open-ended data, including journal articles, speeches, electronic communication, interviews, etc.
- Has an existing dictionary library and can also run analyses from new dictionaries built by the user
- Can perform statistical analyses (e.g., factor analysis, word frequencies, multiple regression)
- KWIC (Key Word In Context) tables are available for any included or excluded word or word pattern
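A KWIC table lists each occurrence of a keyword together with the words around it. The following Python sketch shows the idea; it is not WordStat's implementation, and the function name `kwic`, the context width, and the sample text are assumptions made for illustration.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each match."""
    tokens = re.findall(r"[\w']+", text)
    rows = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            # Take up to `width` words on each side of the keyword.
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, tok, right))
    return rows

# Made-up review text for illustration.
sample = "This book was terrific. I felt terrific after reading it."
for left, word, right in kwic(sample, "terrific", width=2):
    print(f"{left:>12} | {word} | {right}")
```

Reading the keyword in context is what lets an analyst check whether a word is being used in the sense the dictionary assumes.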
Data: Comparing Reviews of the Book on Amazon.com Between Men and Women
1. Create a Text File
2. Input Text File to WordStat
3. Define Your Variables
4. Running the Analysis
5. Existing Dictionary Was Not Relevant for Our Data
6. New Dictionary Available Online!
7. (Free) New Dictionary Download
8. Import New Dictionary, Maintain Exclusion List
9. Level 1 Analysis
10. Level 2 Analysis
11. Overall Frequencies
12. Gender Differences
13. Dendrogram
14. Clustering
15. 3-D Figure of Output
16. Concurrence Matrix
17. KWIC by Gender
18. Words by each Text Case
19. Word Count, Category Frequency
20. Aggression Example
21. Limitations: Terrific = Anxiety?
Discussion and Limitations
- Allows multiple independent variables
- Dictionaries may not always be complete
- Words in the .txt file must be spelled correctly
- Could not distinguish between quotes from the book and original thoughts
- May not account for different usages of certain words (e.g., combating, terrific)
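The last limitation can be seen in a small Python sketch of dictionary-based category counting split by an independent variable. Everything here is hypothetical: the tiny `DICTIONARY`, the two review texts, and the function name `category_counts` are invented for illustration and are not WordStat's dictionary, data, or API.

```python
import re
from collections import Counter, defaultdict

# Hypothetical dictionary; real content-analysis dictionaries are far larger.
DICTIONARY = {
    "anxiety": {"worried", "nervous", "terrific"},
    "aggression": {"fight", "attack", "combating"},
}

def category_counts(cases, dictionary=DICTIONARY):
    """Tally dictionary categories per group from (group, text) pairs."""
    word_to_category = {w: c for c, ws in dictionary.items() for w in ws}
    totals = defaultdict(Counter)
    for group, text in cases:
        for token in re.findall(r"[a-z']+", text.lower()):
            category = word_to_category.get(token)
            if category:
                totals[group][category] += 1
    return totals

# Made-up reviews, each tagged with the reviewer's gender.
cases = [
    ("female", "A terrific book for combating stress."),
    ("male", "I was worried at first, but it was terrific."),
]
totals = category_counts(cases)
for group, counts in totals.items():
    print(group, dict(counts))
```

Both reviews are positive, yet "terrific" is tallied as anxiety and "combating" as aggression, because a word-list lookup cannot see how the word is being used.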
Any Questions? Thank You!