Part Two: Using Xaira to explore corpora

1
Part Two: Using Xaira to explore corpora
  • Richard Xiao
  • z.xiao_at_lancaster.ac.uk

2
Outline of the talk
  • Concordance
  • Wordlist
  • Keywords (No)
  • Output formats
  • Manipulating results
  • Collocation/colligation
  • Distribution analysis
  • Live demonstration
  • Tips for keeping away from bugs
  • Multilingual dimension
  • Xaira FAQs

3
Concordance
  • Word query
      • Search for a word
  • Phrase/Quick query
      • Search for a word or phrase
  • Addkey query
      • POS or lemma search
  • Pattern query
      • Regular expression search
  • XML query
      • Search for XML markup
  • CQL/XQL query
      • Search using the XML-based Corpus Query Language
  • Query builder
      • A powerful combination of all query types
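
To illustrate what a pattern query does in principle, here is a minimal Python sketch that matches a regular expression against a small lexicon. The function name `pattern_query` and the sample lexicon are hypothetical, and Python's `re` syntax is used as a stand-in; Xaira's own regular expression syntax may differ.

```python
import re

def pattern_query(lexicon, pattern):
    """Return, in alphabetical order, the word forms that match the whole pattern."""
    compiled = re.compile(pattern)
    return sorted(word for word in lexicon if compiled.fullmatch(word))

# Hypothetical lexicon; a real one would come from the corpus index.
lexicon = {"walk", "walks", "walked", "walking", "talk", "walker"}
matches = pattern_query(lexicon, r"walk(s|ed|ing)?")
print(matches)
```

The pattern is matched against the whole word form, so "walker" and "talk" are excluded.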

4
Wordlist
  • In Client >> Word query (up to 100,000 lexicon
    entries) sorting alphabetically, by frequency,
    or by the number of forms
  • In Xaira Indexer Tools >> Tools >> Indexer >>
    Options >> Create frequency table
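
The idea of a wordlist sorted alphabetically or by frequency can be sketched in a few lines of Python. This is an illustration only, not how Xaira builds its lexicon: the function names and the simple tokenisation (lower-cased runs of word characters) are assumptions, and sorting by number of forms is not covered.

```python
import re
from collections import Counter

def build_wordlist(text):
    """Count lower-cased word forms in a plain-text string."""
    tokens = re.findall(r"\w+", text.lower())
    return Counter(tokens)

def sort_wordlist(counts, by="frequency"):
    """Return (word, count) pairs sorted alphabetically or by descending frequency."""
    if by == "alphabetical":
        return sorted(counts.items())
    if by == "frequency":
        # Ties are broken alphabetically so the order is deterministic.
        return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    raise ValueError(f"unknown sort order: {by}")

sample = "the cat sat on the mat and the dog sat too"
wordlist = build_wordlist(sample)
for word, count in sort_wordlist(wordlist, by="frequency")[:3]:
    print(word, count)
```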

5
Keyword?
  • Sadly, no
  • Use WordSmith instead
  • WordSmith version 4.0 fully supports Unicode

6
Output formats
  • Page mode vs. Line mode (KWIC)
  • Plain text vs. XML text
  • Scope of context
  • Alignment (left, right, top, bottom)
  • Reference (on the status bar)
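
For readers unfamiliar with Line mode (KWIC, key word in context), the following Python sketch shows the general idea: each hit is printed on one line, with a fixed scope of context on either side and the node word aligned in a column. The names `kwic` and `format_kwic` and the token-based context span are assumptions for illustration, not Xaira's implementation.

```python
def kwic(tokens, keyword, span=4):
    """Return (left, node, right) context strings for each hit of keyword."""
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - span):i])
            right = " ".join(tokens[i + 1:i + 1 + span])
            lines.append((left, token, right))
    return lines

def format_kwic(lines, width=30):
    """Right-align the left context so that node words line up in one column."""
    return [f"{left[-width:]:>{width}}  {node}  {right[:width]}"
            for left, node, right in lines]

tokens = "the cat sat on the mat and the dog sat on the rug".split()
hits = kwic(tokens, "sat", span=3)
for line in format_kwic(hits):
    print(line)
```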

7
Manipulating results
  • Edit query (to save time for related queries)
  • Bibliographical data
  • Sort KWIC concordances
  • Select/block select/copy concordances
  • Right click on a concordance
  • Thin/edit concordances
  • Random sampling
  • Save queries and export them in XML
  • Print results
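
Thinning a large result set by random sampling can be pictured with the short Python sketch below. It is a generic illustration under stated assumptions: the function name `thin_random`, the `seed` parameter, and the choice to keep the sampled lines in their original order are all hypothetical, not a description of Xaira's internals.

```python
import random

def thin_random(concordances, sample_size, seed=None):
    """Keep a random sample of concordance lines, preserving their original order."""
    if sample_size >= len(concordances):
        return list(concordances)
    rng = random.Random(seed)
    chosen = sorted(rng.sample(range(len(concordances)), sample_size))
    return [concordances[i] for i in chosen]

# Hypothetical result set of 100 concordance lines.
concordances = [f"hit {i}" for i in range(100)]
thinned = thin_random(concordances, 5, seed=42)
print(thinned)
```

Fixing the seed makes the sample reproducible, which is useful when a thinned set has to be re-created for a later analysis.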

8
Collocation/colligation
  • Statistical measure (MI or Z)
  • Window span
  • Minimum frequency
  • Minimum MI/Z score
  • Top N collocates
  • Computing collocation statistics for individual
    words
  • Applying selected lemmata
  • Colligation (Addkey tags)
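
The slide names the MI and Z measures without giving formulas. The Python sketch below uses common textbook definitions (MI as the base-2 log of observed over expected co-occurrences; Z as the difference between observed and expected, divided by the standard deviation under a binomial model). Whether Xaira computes them in exactly this way is an assumption, and all function names, parameters, and figures are hypothetical.

```python
import math

def expected_cooccurrence(f_node, f_collocate, corpus_size, window):
    """Co-occurrences expected if the collocate were spread evenly over the corpus."""
    return f_node * f_collocate * window / corpus_size

def mutual_information(observed, f_node, f_collocate, corpus_size, window):
    """MI = log2(observed / expected)."""
    expected = expected_cooccurrence(f_node, f_collocate, corpus_size, window)
    return math.log2(observed / expected)

def z_score(observed, f_node, f_collocate, corpus_size, window):
    """z = (observed - expected) / sqrt(expected * (1 - p))."""
    p = f_collocate / corpus_size  # relative frequency of the collocate
    expected = expected_cooccurrence(f_node, f_collocate, corpus_size, window)
    return (observed - expected) / math.sqrt(expected * (1 - p))

def top_collocates(candidates, f_node, corpus_size, window,
                   min_freq=3, min_mi=3.0, top_n=10):
    """Apply minimum frequency, minimum MI score, and top-N cut-offs."""
    scored = []
    for word, (observed, f_collocate) in candidates.items():
        if observed < min_freq:
            continue
        mi = mutual_information(observed, f_node, f_collocate, corpus_size, window)
        if mi >= min_mi:
            scored.append((word, mi))
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]

# Invented figures: a node word with 1,000 hits in a 1,000,000-word corpus,
# and a window of 8 tokens (4 on either side).
candidates = {"strong": (32, 500), "the": (400, 60000), "rare": (2, 2)}
print(top_collocates(candidates, f_node=1000, corpus_size=1_000_000, window=8))
```

In this invented example "the" co-occurs often but less than chance would predict, and "rare" falls below the minimum frequency, so only "strong" survives the cut-offs.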

9
Distribution analysis
  • Defining partition (subcorpora)
  • (Texts >> Column control to select XML tags)
  • Texts >> Define partition (3 ways)
  • Based on selected class, values in a column, or
    solutions to a query
  • Texts >> Open partition
  • Tabulation (text class, words, hits, etc.)
  • Normalised frequencies for subcorpora
  • Sorting tabulated data
  • Graphic presentation (pie/bar chart)
  • Save distribution data in various forms
  • Copy pie/bar chart
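
Normalised frequencies make hit counts comparable across subcorpora of different sizes. The Python sketch below shows the arithmetic with a per-million-words base; the base, the function names, and the partition figures are invented for illustration and do not come from the slides.

```python
def normalised_frequency(hits, words, per=1_000_000):
    """Scale a raw hit count to a common base (here: per million words)."""
    return hits * per / words

def tabulate(partitions):
    """Return rows (class, words, hits, normalised), highest normalised frequency first."""
    rows = [
        (name, data["words"], data["hits"],
         normalised_frequency(data["hits"], data["words"]))
        for name, data in partitions.items()
    ]
    return sorted(rows, key=lambda row: -row[3])

# Invented partition sizes and hit counts.
partitions = {
    "written": {"words": 8_000_000, "hits": 1200},
    "spoken": {"words": 2_000_000, "hits": 500},
}
table = tabulate(partitions)
for name, words, hits, norm in table:
    print(f"{name:<10}{words:>12}{hits:>8}{norm:>10.1f}")
```

Here the written partition has more raw hits, but the spoken partition has the higher normalised frequency.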

10
Additional features of Xaira
  • Annotating concordances (making notes)
  • Copying query text or notes
  • User-defined stylesheet
  • Colour book (e.g. different colours for different
    POS categories)
  • Remote access over a network
  • Platform-independent

11
Xaira live demonstration
  • Here we go
  • slides to follow

12
Tips for keeping away from bugs
  • In the Line mode, a maximum of 1,524 concordances
    are displayed
  • See the rest in the Page mode
  • In Query builder, joining query nodes in the
    horizontal direction (OR) and then in the
    vertical direction (AND) may produce unreliable
    counts when the Link type is specified as
    One-way or Two-way
  • Only define Link type as Next or Not next
  • If thousands of hits are downloaded and dozens of
    them are deleted by reverse selection in
    thinning, the system may crash
  • If concordances have been sorted/edited, a saved
    query may not be opened again
  • Save the edited concordances as an XML list using
    Query Listing in the menu or the corresponding
    button on the toolbar

13
Truly multilingual - Chinese

14
Truly multilingual - Bengali

15
Truly multilingual - Hindi

16
Truly multilingual - Punjabi

17
Truly multilingual - Urdu

18
Xaira FAQs
  • Is Xaira free and where can I get it?
  • Yes, it is absolutely free. You can get a copy
    (binary for Windows, and source code for
    compilation on Unix/Linux/Mac systems) at the
    SourceForge website. The latest release is 116.
    http://sourceforge.net/project/showfiles.php?group_id=130289
  • Where can I get more documentation?
  • In addition to the built-in help file, more
    documentation is available at the Xaira site
    http://www.oucs.ox.ac.uk/rts/xaira/
  • Where can I get technical help?
  • You can sign up for the Xaira Preview List to get
    help http://www.tei-c.org.uk/tei-bin/betatest
  • For a critical review, see
    http://www.lancs.ac.uk/postgrad/xiaoz/papers/xaira_review.pdf