Title: Linguistic Network


1
Linguistic Network
  • Word Adjacency
  • Anthony Strathman
  • Yilin Wu

2
What are linguistic networks?
  • Introduction

3
Motivation
  • The study of word connections is important for:
  • The structure and evolution of a language.
  • Cognitive science: the transition from
    non-syntactic to syntactic communication involved
    the co-evolution of language and brain.

4
History of Studies
  • Zipf's law: the frequency of a word follows a
    power-law distribution with respect to its rank
    in the frequency table.
  • (Cf. http://en.wikipedia.org/wiki/Zipf's_law)
  • It does not involve the interactions between
    words, so it does not provide deep insight into
    the organization of a language.
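
To make the rank-frequency table behind Zipf's law concrete, here is a minimal Python sketch. The function name `rank_frequency`, the tokenization rule (lowercase runs of letters and apostrophes), and the sample sentence are illustrative assumptions, not part of the original slides.

```python
import re
from collections import Counter

def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first."""
    # Assumption: a "word" is a lowercase run of letters or apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # most_common() sorts by descending count; ties keep first-seen order.
    return [(rank, word, count)
            for rank, (word, count) in enumerate(counts.most_common(), start=1)]

# Hypothetical sample text, far too short to show a real power law.
sample = "the prince and the people and the state"
table = rank_frequency(sample)
for rank, word, count in table:
    print(rank, word, count)
```

On a full-length text, plotting count against rank on log-log axes should give a roughly straight line if the power law holds. Note that the table treats each word in isolation, which is the limitation the last bullet points out.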

5
Network Approach
  • A convenient tool to study the interaction of
    words, thus giving insight into the organization
    and dynamics of a language
  • People have studied word networks from various
    perspectives (synonymy, polysemy, word
    co-occurrence, etc.). Most results show small-world
    and scale-free properties of language (see
    references).
  • For example, in their word co-occurrence study,
    Ferrer i Cancho and Sole (2001) showed (1) that
    the average distance between any two words is
    d ≈ 2-3 and (2) a scale-free distribution of
    degrees.
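
As a rough illustration of what a degree distribution is, the sketch below counts how many nodes have each degree in a small undirected word network given as a list of word pairs. The function name `degree_distribution` and the toy edge list are assumptions made for this example; this is not the procedure used in the cited study.

```python
from collections import Counter

def degree_distribution(edges):
    """Map each degree k to the number of nodes that have exactly k neighbors."""
    neighbors = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-links
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    degrees = {node: len(adj) for node, adj in neighbors.items()}
    return Counter(degrees.values())

# Hypothetical toy network: one hub word linked to several others.
edges = [("the", "prince"), ("the", "people"),
         ("the", "state"), ("prince", "himself")]
dist = degree_distribution(edges)
for k in sorted(dist):
    print(k, dist[k])
```

In a scale-free network the number of nodes with degree k falls off approximately as a power of k, so a few hub words carry most of the links.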

6
Word Adjacency Network
  • Project Description

7
Network Definition
  • We take our nodes to be individual words
  • The nodes are said to be linked if they appear
    adjacent to each other in a sentence
  • All links are taken to be bidirectional

Excerpt from The Prince: "Such dominions thus
acquired are either accustomed to live under a
prince, or to live in freedom and are acquired
either by the arms of the prince himself, or of
others, or else by fortune or by ability."
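
The network definition on this slide can be sketched in Python as follows, using the excerpt from The Prince as input. The function name `build_adjacency_network`, the tokenization rule, and the choice to let links cross commas but not sentence-ending punctuation are assumptions for illustration; the slides do not say how the authors handled punctuation or capitalization.

```python
import re
from collections import defaultdict

def build_adjacency_network(text):
    """Link every pair of distinct words that appear side by side in a sentence."""
    network = defaultdict(set)
    # Assumption: sentences end at '.', '!' or '?'; links never cross that boundary.
    for sentence in re.split(r"[.!?]+", text.lower()):
        words = re.findall(r"[a-z']+", sentence)
        for left, right in zip(words, words[1:]):
            if left != right:
                # Links are bidirectional, so store both directions.
                network[left].add(right)
                network[right].add(left)
    return network

excerpt = ("Such dominions thus acquired are either accustomed to live under a "
           "prince, or to live in freedom and are acquired either by the arms of "
           "the prince himself, or of others, or else by fortune or by ability.")
network = build_adjacency_network(excerpt)
print(sorted(network["prince"]))
```

With these assumptions, "prince" ends up linked to "a", "himself", "or" and "the". Storing neighbors in sets means a repeated word pair adds no extra link, which matches an unweighted network.
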
8
Motivation
  • Studying linguistic networks can give insight
    into the structure of natural language
  • Looking for any structural drift over time

9
Source Material
  • Four different texts
  • The Prince (Niccolo Machiavelli)
  • A Tale of Two Cities (Charles Dickens)
  • Gone with the Wind (Margaret Mitchell)
  • Crime and Punishment (Fyodor Dostoyevsky)

10
The Resultant Networks
  • Results

11
Example Network
  • Trimmed version of network from The Prince

12
General Characteristics

13
Degree Distribution

14
Clustering

15
Clustering

16
Clustering

17
Assortativity

18
Shortest Path Length and Diameter
  • The measured diameter of all four networks was 2
  • The distribution of path lengths is remarkably
    narrow (mean of 1.99 with a standard deviation of
    around 0.01)
  • Given the difference in size of these networks
    (ranging from 5,000 to 20,000 nodes), we see no
    change in diameter with size.
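
One way to obtain shortest path lengths and the diameter of an unweighted network is a breadth-first search from every node. The sketch below assumes the network is a dictionary mapping each word to its set of neighbors; the function name `path_length_stats` and the toy network are hypothetical, and the slides do not state which tool the authors actually used.

```python
from collections import deque

def path_length_stats(network):
    """Return (mean shortest path length, diameter) over all connected node pairs."""
    total, pairs, diameter = 0, 0, 0
    for source in network:
        # Breadth-first search gives shortest distances in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in network[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        for target, d in dist.items():
            if target != source:
                total += d
                pairs += 1
                diameter = max(diameter, d)
    return (total / pairs if pairs else 0.0), diameter

# Hypothetical toy network: a chain a - b - c - d.
toy = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
mean_length, diameter = path_length_stats(toy)
print(mean_length, diameter)
```

Each search costs time proportional to the number of nodes plus links, so the full computation grows roughly quadratically with network size. That should still be manageable for networks of the sizes quoted above.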

19
Conclusions
  • The word adjacency networks of all four sources
    are very similar in structure
  • By this measure, grammatical structure seems
    fairly stable in time
  • The structure seems to be independent of the
    length of the text.

20
References
  • Dorogovtsev, S. N. and Mendes, J. F. F., Language
    as an evolving word web, Proc. R. Soc. London B
    268, 2603-2606 (2001).
  • Ferrer i Cancho, R. and Sole, R. V., The small
    world of human language, Proc. R. Soc. London B
    268, 2261-2265 (2001).
  • Sigman, M. and Cecchi, G. A., Global organization
    of the Wordnet lexicon, Proc. Natl. Acad. Sci.
    USA 99, 1742-1747 (2002).
  • Steyvers, M. and Tenenbaum, J. B., The large-scale
    structure of semantic networks: Statistical
    analyses and a model for semantic growth,
    Preprint cond-mat/0110012 (2001).
  • Corominas, B. and Sole, R. V., Network
    topology and self-consistency in language games,
    Journal of Theoretical Biology 241(2), 438-441
    (2006).
  • Motter, A. E., de Moura, A. P. S., Lai, Y-C., and
    Dasgupta, P., Topology of the conceptual
    network of language, Physical Review E 65,
    065102 (2002).
  • Text samples were obtained from:
  • The database search at
    http://onlinebooks.library.upenn.edu/search.html
  • Project Gutenberg