COGN1001: Introduction to Cognitive Science. Topics in Computer Science: Formal Languages and Models of Computation

1
COGN1001: Introduction to Cognitive Science
Topics in Computer Science: Formal Languages and
Models of Computation
  • Qiang HUO
  • Department of Computer Science
  • The University of Hong Kong
  • (E-mail qhuo_at_cs.hku.hk)

2
Outline
  • What is a Formal Language?
  • Phrase-Structure Grammars
  • Finite State Automata
  • Formal languages and Models of Computation

3
Natural Language vs. Formal Language
  • Natural language: the written and/or spoken
    languages of the world, such as Chinese, English,
    Japanese, German, French, Spanish, etc. A natural
    language has both
  • syntax, and
  • semantics.
  • Formal language: a language specified by a
    well-defined set of rules of syntax.
  • The study of formal languages is important to
    computer science.
  • For example, we need to know which statements
    are acceptable in the C programming language.
    Checking this is a task of the compiler for a
    programming language.

4
Formal Language
  • We will describe the sentences of a formal
    language using a grammar.
  • How can we determine whether a combination of
    words is a valid sentence in a formal language?
  • How can we generate the valid sentences of a
    formal language?
  • We will only be interested in the syntax, not the
    semantics (meaning), of a language.

5
  1. a sentence is made up of a noun-phrase followed
    by a verb-phrase
  2. a noun-phrase is made up of an article followed
    by an adjective followed by a noun, or
  3. a noun-phrase is made up of an article followed
    by a noun
  4. a verb-phrase is made up of a verb followed by an
    adverb, or
  5. a verb-phrase is made up of a verb
  6. an article is a, or
  7. an article is the
  8. an adjective is large, or
  9. an adjective is hungry
  10. a noun is rabbit, or
  11. a noun is mathematician
  12. a verb is eats, or
  13. a verb is hops
  14. an adverb is quickly, or
  15. an adverb is wildly.

If we define a subset of English using the list
of rules shown here, which describe how a valid
sentence can be produced, what does the language
look like?
6
Example: a Subset of English
  • From the previous rules we can form valid
    sentences by a series of replacements, continuing
    until no more rules can be applied.
  • For instance, the valid sentence "the large
    rabbit hops quickly" can be obtained by the
    following sequence of replacements:
  • sentence
  • noun-phrase verb-phrase
  • article adjective noun verb-phrase
  • article adjective noun verb adverb
  • the adjective noun verb adverb
  • the large noun verb adverb
  • the large rabbit verb adverb
  • the large rabbit hops adverb
  • the large rabbit hops quickly
  • Some other valid sentences:
  • a hungry mathematician eats wildly
  • the rabbit eats quickly
  • An invalid sentence: the quickly eats
    mathematician
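The replacement process can be sketched in a few lines of Python. This is only an illustration: the names RULES, expand and SENTENCES are invented here, and exhaustive expansion works only because none of the fifteen rules refers back to itself.

```python
from itertools import product

# Each category maps to its alternative right-hand sides (rules 1-15).
RULES = {
    "sentence": [["noun-phrase", "verb-phrase"]],
    "noun-phrase": [["article", "adjective", "noun"], ["article", "noun"]],
    "verb-phrase": [["verb", "adverb"], ["verb"]],
    "article": [["a"], ["the"]],
    "adjective": [["large"], ["hungry"]],
    "noun": [["rabbit"], ["mathematician"]],
    "verb": [["eats"], ["hops"]],
    "adverb": [["quickly"], ["wildly"]],
}

def expand(symbol):
    """Return every word sequence that `symbol` can be replaced by."""
    if symbol not in RULES:          # a plain English word: nothing to replace
        return [[symbol]]
    results = []
    for rhs in RULES[symbol]:
        # Combine every expansion of each symbol on the right-hand side.
        for parts in product(*(expand(s) for s in rhs)):
            results.append([word for part in parts for word in part])
    return results

SENTENCES = {" ".join(words) for words in expand("sentence")}

print(len(SENTENCES))                                  # 12 noun-phrases x 6 verb-phrases
print("the large rabbit hops quickly" in SENTENCES)
print("the quickly eats mathematician" in SENTENCES)
```

Listing every sentence like this is feasible only for a finite language; most grammars of interest generate infinitely many sentences.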

7
Some Terminology
  • A vocabulary (or alphabet) V is a finite,
    nonempty set of elements called symbols.
  • A word (or sentence) over V is a string of finite
    length of elements of V.
  • The empty string or null string, denoted by λ, is
    the string containing no symbols.
  • The set of all words (or sentences) over V is
    denoted by V*.
  • A language over V is a subset of V*.
  • Example: in English,
  • the alphabet V consists of English letters and
    other symbols;
  • a word (or sentence) over V is a finite string of
    symbols;
  • the set of meaningful words (or sentences) of
    English is a subset of V*.
8
How to specify a language?
  • to list all the words (or sentences) in the
    language or
  • to give some criteria that a word (or a sentence)
    must satisfy to be in the language or
  • to specify a language through the use of a
    grammar, such as the set of rules we gave in the
    previous example of English subset.

9
Outline
  • What is a Formal Language?
  • Phrase-Structure Grammars
  • Finite State Automata
  • Formal languages and Models of Computation

10
What is a Phrase-Structure Grammar?
  • A phrase-structure grammar is G = (V, T, S, P),
    where
  • V is a vocabulary,
  • T is a subset of V consisting of terminal
    elements (i.e., the elements of V which cannot
    be replaced by other symbols),
  • the elements of N = V - T are called nonterminal
    symbols (i.e., the elements of V which can be
    replaced by other symbols),
  • S is a start symbol from V (i.e., the element of
    V that we always begin with), and
  • P is a set of productions.
  • We denote by w0 → w1 the production that specifies
    that w0 can be replaced by w1.
  • Every production in P must contain at least one
    nonterminal on its left side.

11
Example: a Phrase-Structure Grammar
  • G = (V, T, S, P), where
  • V = {a, the, large, hungry, rabbit,
    mathematician, eats, hops, quickly, wildly,
    sentence, noun-phrase, verb-phrase, article,
    adjective, noun, verb, adverb}
  • T = {a, the, large, hungry, rabbit,
    mathematician, eats, hops, quickly, wildly}
  • V - T = {sentence, noun-phrase, verb-phrase,
    article, adjective, noun, verb, adverb}
  • S = sentence
  • Production rules P

12
  • P = {
  • sentence → noun-phrase verb-phrase,
  • noun-phrase → article adjective noun,
  • noun-phrase → article noun,
  • verb-phrase → verb adverb,
  • verb-phrase → verb,
  • article → a,
  • article → the,
  • adjective → large,
  • adjective → hungry,
  • noun → rabbit,
  • noun → mathematician,
  • verb → eats,
  • verb → hops,
  • adverb → quickly,
  • adverb → wildly }

13
Some Terminology
  • Let G = (V, T, S, P) be a phrase-structure
    grammar.
  • Let w0 = l z0 r and w1 = l z1 r be strings over V
    (that is, the concatenations of l, z0, r and of
    l, z1, r).
  • If z0 → z1 is a production of G, we say that w1
    is directly derivable from w0 and we write
    w0 ⇒ w1.
  • Example:
  • the adjective noun verb adverb ⇒ the large
    noun verb adverb, because
  • adjective → large is a production.
  • If w0, w1, ..., wn, n ≥ 0, are strings over V such
    that
  • w0 ⇒ w1, w1 ⇒ w2, ..., w(n-1) ⇒ wn, then
  • we say that wn is derivable from w0, and
  • we write w0 ⇒* wn.
  • The sequence of steps used to obtain wn from w0
    is called a derivation.

14
  • Example: sentence ⇒* the large rabbit hops
    quickly
  • via the following derivation:
  • sentence ⇒ noun-phrase verb-phrase,
  • noun-phrase verb-phrase ⇒ article adjective noun
    verb-phrase,
  • article adjective noun verb-phrase ⇒ article
    adjective noun verb adverb,
  • article adjective noun verb adverb ⇒ the
    adjective noun verb adverb,
  • the adjective noun verb adverb ⇒ the large noun
    verb adverb,
  • the large noun verb adverb ⇒ the large rabbit
    verb adverb,
  • the large rabbit verb adverb ⇒ the large rabbit
    hops adverb,
  • the large rabbit hops adverb ⇒ the large rabbit
    hops quickly.

15
What is the language generated by a
Phrase-Structure Grammar?
  • Let G = (V, T, S, P) be a phrase-structure
    grammar.
  • The language generated by G (or the language of
    G), denoted by L(G), is the set of all strings of
    terminals that are derivable from the starting
    symbol S.
  • L(G) = {w ∈ T* | S ⇒* w}

16
  • Example: Suppose G = (V, T, S, P), where
    V = {a, b, A, B, S},
  • T = {a, b}, S is the start symbol, and
  • P = {S → ABa, A → BB, B → ab, AB → b}.
  • All the "sentences" (words) generated by this
    grammar are
  • {abababa, ba}, since
  • S ⇒ ABa ⇒ BBBa ⇒* abababa, and
  • S ⇒ ABa ⇒ ba.
  • Example: Let G be the grammar with V = {S, 0, 1},
    T = {0, 1},
  • starting symbol S, and production rules
  • P = {S → 11S, S → 0}.
  • L(G) = {(11)^n 0 | n = 0, 1, 2, ...}.
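Both examples can be checked mechanically by applying productions in every possible way. The sketch below is a hedged illustration, not part of the lecture: the function name generate, the convention that uppercase letters are nonterminals, and the length cap max_len (needed to stop the search for infinite languages) are all assumptions made here.

```python
from collections import deque

def generate(productions, start, max_len):
    """Collect all terminal strings derivable from `start`, exploring only
    sentential forms of at most `max_len` symbols (one character each)."""
    seen, queue, words = {start}, deque([start]), set()
    while queue:
        form = queue.popleft()
        if not any(ch.isupper() for ch in form):   # only terminals left
            words.add(form)
            continue
        for lhs, rhs in productions:
            i = form.find(lhs)
            while i != -1:                         # rewrite every occurrence
                new = form[:i] + rhs + form[i + len(lhs):]
                if len(new) <= max_len and new not in seen:
                    seen.add(new)
                    queue.append(new)
                i = form.find(lhs, i + 1)
    return words

# P = {S -> ABa, A -> BB, B -> ab, AB -> b}
G1 = [("S", "ABa"), ("A", "BB"), ("B", "ab"), ("AB", "b")]
# P = {S -> 11S, S -> 0}
G2 = [("S", "11S"), ("S", "0")]

print(sorted(generate(G1, "S", 10)))
print(sorted(generate(G2, "S", 7), key=len))
```

The cap is safe for these two grammars because no production shortens a string; a grammar with erasing productions might need longer intermediate forms than the words it produces.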

17
How to construct a grammar that generates a given
language?
  • Example:
  • Find a phrase-structure grammar to generate
    the set
  • {0^n 1^n | n = 0, 1, 2, ...}.
  • Solution:
  • G = (V, T, S, P), where
  • V = {S, 0, 1},
  • T = {0, 1},
  • S is the start symbol, and
  • P = {S → 0S1, S → λ}.

18
How to construct a grammar that generates a given
language? (2)
  • Example: Find a phrase-structure grammar to
    generate the set {0^m 1^n | m, n = 0, 1, 2, ...}.
  • Solution 1: G1 = (V, T, S, P), where
  • V = {S, 0, 1}, T = {0, 1}, S is the start symbol,
    and
  • P = {S → 0S, S → S1, S → λ}.
  • Solution 2: G2 = (V, T, S, P), where
  • V = {S, A, 0, 1}, T = {0, 1}, S is the start
    symbol, and
  • P = {S → 0S, S → 1A, S → 1, A → 1A, A → 1, S → λ}.

Two grammars can generate the same language!
19
How to construct a grammar that generates a given
language? (3)
  • There are many techniques from the theory of
    computation that can be used to systematically
    construct a grammar for a given formal language,
    but they are beyond the scope of this course.

20
Types of Phrase-Structure Grammars (1)
  • Phrase-structure grammars can be classified
    according to the types of productions that are
    allowed.
  • Such a classification scheme, introduced by Noam
    Chomsky, is as follows:
  • A type 0 grammar has no restrictions on its
    productions.
  • A type 1, or context-sensitive, grammar can have
    productions only of the form
  • w1 → w2, where l(w1) ≤ l(w2) and l(w) denotes the
    length of w, or of the form
  • w1 → λ.
  • A type 2, or context-free, grammar can have
    productions only of the form
  • A → w2, where A is a single nonterminal symbol.

21
Types of Phrase-Structure Grammars (2)
  • A type 3, or regular, grammar can have productions
    only of the form
  • A → aB,
  • A → a,
  • S → λ,
  • where A and B are nonterminal symbols, S is
    the start symbol, and a is a terminal symbol.
  • A language generated by a
  • type 1 grammar is called a context-sensitive
    language
  • type 2 grammar is called a context-free
    language
  • type 3 grammar is called a regular language.
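The four types can be told apart by inspecting the productions one by one. The following Python sketch applies the restrictions exactly as stated on these slides; the function name chomsky_type, the single-character symbol encoding, and the use of the empty string for λ are assumptions made for the illustration.

```python
def chomsky_type(productions, nonterminals, start="S"):
    """Return the largest type number (3, 2, 1 or 0) whose restrictions
    every production satisfies. Symbols are single characters and the
    empty string "" stands for the empty word."""
    def is_type3(lhs, rhs):
        if lhs == start and rhs == "":
            return True                                   # S -> lambda
        if lhs not in nonterminals:
            return False
        if len(rhs) == 1:
            return rhs not in nonterminals                # A -> a
        return (len(rhs) == 2 and rhs[0] not in nonterminals
                and rhs[1] in nonterminals)               # A -> aB

    def is_type2(lhs, rhs):
        return lhs in nonterminals                        # A -> w

    def is_type1(lhs, rhs):
        return rhs == "" or len(lhs) <= len(rhs)          # w1 -> w2 or w1 -> lambda

    for number, test in ((3, is_type3), (2, is_type2), (1, is_type1)):
        if all(test(lhs, rhs) for lhs, rhs in productions):
            return number
    return 0

N = {"S", "A", "B"}
REGULAR = [("S", "0S"), ("S", "1A"), ("S", "1"),
           ("A", "1A"), ("A", "1"), ("S", "")]
CONTEXT_FREE = [("S", "0S1"), ("S", "")]
CONTEXT_SENSITIVE = [("S", "0SAB"), ("S", ""), ("BA", "AB"), ("0A", "01"),
                     ("1A", "11"), ("1B", "12"), ("2B", "22")]
UNRESTRICTED = [("S", "ABa"), ("A", "BB"), ("B", "ab"), ("AB", "b")]

for grammar in (REGULAR, CONTEXT_FREE, CONTEXT_SENSITIVE, UNRESTRICTED):
    print(chomsky_type(grammar, N))
```

Note that this classifies a grammar, not a language: a regular language can also be generated by a grammar of a lower type number.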

22
Examples
  • {0^m 1^n | m, n = 0, 1, 2, ...} is a regular
    language, since it can be generated by a regular
    grammar G with
  • P = {S → 0S, S → 1A, S → 1, A → 1A, A → 1, S → λ}.
  • {0^n 1^n | n = 0, 1, 2, ...} is a context-free
    language, since it can be generated by a
    context-free grammar G with
  • P = {S → 0S1, S → λ}.
  • {0^n 1^n 2^n | n = 0, 1, 2, ...} is a
    context-sensitive language, since it can be
    generated by a type 1 grammar
  • G = (V, T, S, P) with V = {0, 1, 2, S, A, B},
    T = {0, 1, 2}, starting symbol S, and productions
  • P = {S → 0SAB, S → λ, BA → AB, 0A → 01, 1A → 11,
    1B → 12, 2B → 22},
  • but not by any type 2 grammar.

25
Example: Backus-Naur Form
  • What is the Backus-Naur form of the grammar for
    the subset of English described before?
  • <sentence> ::= <noun phrase><verb phrase>
  • <noun phrase> ::= <article><adjective><noun> |
    <article><noun>
  • <verb phrase> ::= <verb><adverb> | <verb>
  • <article> ::= a | the
  • <adjective> ::= large | hungry
  • <noun> ::= rabbit | mathematician
  • <verb> ::= eats | hops
  • <adverb> ::= quickly | wildly

26
What is Backus-Naur Form (BNF)?
  • There is another notation that is used to specify
    a type 2 (context-free) grammar, called the
    Backus-Naur form:
  • all productions having the same nonterminal as
    their left-hand side are combined into one
    statement listing the different right-hand sides
    of these productions, each separated by a bar
    (|), with
  • nonterminal symbols enclosed in angle brackets
    (< >), and
  • the symbol → replaced by ::=.
  • Example: The Backus-Naur form for a grammar that
    produces signed integers is as follows:
  • <signed integer> ::= <sign><integer>
  • <sign> ::= + | -
  • <integer> ::= <digit> | <digit><integer>
  • <digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
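Each BNF statement translates almost directly into a small recognizer function. The sketch below assumes that a sign is either + or - and that an integer is one or more digits; the function names are invented for the illustration.

```python
DIGITS = "0123456789"

def is_integer(s):
    """<integer> ::= <digit> | <digit><integer>"""
    if len(s) == 0 or s[0] not in DIGITS:
        return False
    return len(s) == 1 or is_integer(s[1:])

def is_signed_integer(s):
    """<signed integer> ::= <sign><integer>, with <sign> ::= + | -"""
    return len(s) >= 2 and s[0] in "+-" and is_integer(s[1:])

for text in ("+42", "-0", "42", "+", "-4a"):
    print(text, is_signed_integer(text))
```

An unsigned string such as 42 is rejected because this grammar makes the sign mandatory.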

27
What is a Derivation (or Parse) Tree?
  • A derivation in the language generated by a
    context-free grammar can be represented
    graphically using an ordered rooted tree, called
    a derivation (or parse) tree, in which
  • the root represents the starting symbol,
  • internal vertices represent nonterminals,
  • leaves represent terminals, and
  • the children of a vertex are the symbols on the
    right side of a production, in order from left to
    right, where the symbol represented by the parent
    is on the left-hand side.

28
Example
  • Construct a derivation tree for the sentence
    "the hungry rabbit eats quickly", using the
    grammar for a subset of English discussed
    previously.
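One possible way to write this tree down is as nested (symbol, children) pairs. The representation and the helper names leaf, leaves and show are choices made for this sketch, not notation from the lecture.

```python
# A tree node is (symbol, children); a leaf (terminal) has no children.
def leaf(word):
    return (word, [])

TREE = ("sentence", [
    ("noun-phrase", [
        ("article", [leaf("the")]),
        ("adjective", [leaf("hungry")]),
        ("noun", [leaf("rabbit")]),
    ]),
    ("verb-phrase", [
        ("verb", [leaf("eats")]),
        ("adverb", [leaf("quickly")]),
    ]),
])

def leaves(node):
    """Read the terminals at the leaves from left to right."""
    symbol, children = node
    if not children:
        return [symbol]
    return [word for child in children for word in leaves(child)]

def show(node, depth=0):
    """Print the tree with one level of indentation per generation."""
    symbol, children = node
    print("  " * depth + symbol)
    for child in children:
        show(child, depth + 1)

show(TREE)
print(" ".join(leaves(TREE)))
```

Reading the leaves from left to right recovers the sentence, and each internal node together with its children corresponds to one production of the grammar.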

29
How to determine whether a string is in the
language generated by a context-free grammar?
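  • Top-down parsing:
  • begins with the starting symbol and proceeds by
    successively applying productions to see if the
    given string can be derived.
  • Bottom-up parsing:
  • works backwards, starting from the given string
    and applying productions in reverse to see if the
    starting symbol can be reached.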

30
  • Example: Determine whether the word cbab belongs
    to L(G), where G = (V, T, S, P) with
  • V = {a, b, c, A, B, C, S},
  • T = {a, b, c},
  • S is the starting symbol, and the productions
    are
  • S → AB
  • A → Ca
  • B → Ba
  • B → Cb
  • B → b
  • C → cb
  • C → b
  • Top-down parsing:
  • S ⇒ AB
  • S ⇒ AB ⇒ CaB
  • S ⇒ AB ⇒ CaB ⇒ cbaB
  • S ⇒ AB ⇒ CaB ⇒ cbaB ⇒ cbab
  • Bottom-up parsing:
  • Cab ⇒ cbab
  • Ab ⇒ Cab ⇒ cbab
  • AB ⇒ Ab ⇒ Cab ⇒ cbab
  • S ⇒ AB ⇒ Ab ⇒ Cab ⇒ cbab
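The top-down search can be sketched as a recursive trial of productions on the leftmost nonterminal. This is a naive backtracking illustration for this particular grammar (the names PRODUCTIONS and derive are invented here); it relies on the fact that none of these productions shortens a string, so any form longer than the target can be abandoned.

```python
PRODUCTIONS = {
    "S": ["AB"],
    "A": ["Ca"],
    "B": ["Ba", "Cb", "b"],
    "C": ["cb", "b"],
}

def derive(form, target):
    """Return a derivation (list of sentential forms) from `form` to
    `target`, always rewriting the leftmost nonterminal, or None."""
    if form == target:
        return [form]
    # No production shortens a form, so longer forms can never succeed.
    if len(form) > len(target):
        return None
    for i, symbol in enumerate(form):
        if symbol in PRODUCTIONS:
            break
    else:
        return None                      # all terminals, but not the target
    if form[:i] != target[:i]:           # terminal prefix must already match
        return None
    for rhs in PRODUCTIONS[symbol]:
        rest = derive(form[:i] + rhs + form[i + 1:], target)
        if rest is not None:
            return [form] + rest
    return None

print(" => ".join(derive("S", "cbab")))
print(derive("S", "cab"))
```

The length check is also what keeps the left-recursive production B → Ba from looping forever.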

31
Outline
  • What is a Formal Language?
  • Phrase-Structure Grammars
  • Finite State Automata
  • Formal languages and Models of Computation

32
Finite State Machines with No Output
  • Finite-state machines with no output are also
    called finite-state automata.
  • Finite-state automata do not generate output. But
    they have a set of special states, called final
    states.
  • A finite-state automaton is often used for
    language recognition.
  • This application plays a fundamental role in the
    design and construction of compilers for
    programming languages.

33
What is a Deterministic Finite-State Automaton?
  • A finite-state automaton M = (S, I, f, s0, F)
    consists of
  • a finite set S of states,
  • a finite input alphabet I,
  • a transition function f that assigns a state to
    every pair of state and input,
  • an initial state s0, and
  • a subset F of S consisting of final states.

34
How to represent a Finite-State Automaton?
  • We can represent a finite-state automaton using
    either a state table or a state diagram. Final
    states are indicated in the state diagram by
    using double circles.
  • What is the state table of the above finite-state
    automaton?

35
What is the language recognized by a given
Finite-State Automaton?
  • An input string is recognized or accepted by an
    automaton M if the string takes the automaton to
    one of its final states.
  • The language recognized by an automaton M,
    denoted by L(M), is the set of all strings that
    are recognized by M.

The language recognized by the above finite-state
automaton M is L(M) = {0^n, 0^n 10x | n = 0, 1, 2, ...,
and x is any string}.
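Running a deterministic automaton on an input is a simple table lookup per symbol. The state diagram referred to above is not reproduced in this transcript, so the transition table below is a hypothetical automaton, chosen only so that it recognizes the language stated for M; the state names and the trap state s2 are assumptions.

```python
# Hypothetical automaton: the slide's diagram is not in this transcript,
# so these transitions are an assumption chosen to match the stated L(M).
TRANSITIONS = {
    ("s0", "0"): "s0", ("s0", "1"): "s1",
    ("s1", "0"): "s3", ("s1", "1"): "s2",
    ("s2", "0"): "s2", ("s2", "1"): "s2",   # s2 is a dead (trap) state
    ("s3", "0"): "s3", ("s3", "1"): "s3",
}
START, FINAL = "s0", {"s0", "s3"}

def accepts(word):
    """Run the automaton on `word`; accept if it ends in a final state."""
    state = START
    for symbol in word:
        state = TRANSITIONS[(state, symbol)]
    return state in FINAL

for word in ("", "000", "0010", "10110", "1", "011"):
    print(repr(word), accepts(word))
```

Because the transition function assigns exactly one next state to every (state, input) pair, the loop never has to make a choice.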
36
Deterministic vs Nondeterministic Finite-State
Automata
  • The finite-state automata discussed so far are
    deterministic, since for each pair of state and
    input value there is a unique next state given by
    the transition function.
  • There is another important type of finite-state
    automaton in which there may be several possible
    next states for each pair of state and input
    value.
  • Such machines are called nondeterministic.
  • Nondeterministic finite-state automata are
    important in determining which languages can be
    recognized by a finite-state automaton.

37
What is a Nondeterministic Finite-State Automaton?
  • A nondeterministic finite-state automaton
  • M = (S, I, f, s0, F) consists of
  • a finite set S of states,
  • a finite input alphabet I,
  • a transition function f that assigns a set of
    states to each pair of state and input,
  • an initial state s0, and
  • a subset F of S consisting of final states.

38
How to represent a Nondeterministic Finite-State
Automaton?
  • Using a state table: for each pair of state and
    input value we give a list of possible next
    states.
  • Using a state diagram: include an edge from each
    state to all possible next states, labelling
    edges with the input(s) that lead to this
    transition.

39
What is the language recognized by a given
Nondeterministic Finite-State Automaton?
  • What does it mean for a nondeterministic
    finite-state automaton to recognize a string
    x = x1 x2 ... xk?
  • x1 takes the starting state s0 to a set S1 of
    states.
  • x2 takes each of the states in S1 to a set of
    states.
  • Let S2 be the union of these sets.
  • Continue this process, including at each stage all
    states that can be obtained using
  • a state obtained at the previous stage and
  • the current input symbol.
  • The string x is recognized or accepted if there
    is a final state in the set of all states that
    can be obtained from s0 using x.
  • The language recognized by a nondeterministic
    finite-state automaton is the set of all strings
    recognized by this automaton.

40
Example
  • Determine the language recognized by the
    nondeterministic finite-state automaton M shown
    in the following figure.
  • Solution: L(M) = {0^n, 0^n 01, 0^n 11 |
    n = 0, 1, 2, ...}.
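Recognition by a nondeterministic automaton can be simulated by tracking the set of all states reachable so far. Since the figure is missing from this transcript, the automaton below is a hypothetical one, built here only so that it recognizes the language given in the solution; its states and transitions are assumptions.

```python
# Hypothetical NFA (the figure is missing from the transcript); pairs that
# are not listed have no next state.
NFA = {
    ("s0", "0"): {"s0", "s1"},
    ("s0", "1"): {"s2"},
    ("s1", "1"): {"s3"},
    ("s2", "1"): {"s3"},
}
START, FINAL = "s0", {"s0", "s3"}

def nfa_accepts(word):
    """Track the set of all states reachable after each input symbol."""
    current = {START}
    for symbol in word:
        current = set().union(*(NFA.get((s, symbol), set()) for s in current))
    return bool(current & FINAL)

for word in ("", "000", "01", "0001", "11", "011", "1", "010"):
    print(repr(word), nfa_accepts(word))
```

The string is accepted as soon as at least one of the reachable states at the end of the input is final.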

41
An Important Fact
  • Theorem
  • If the language L is recognized by a
    nondeterministic finite-state automaton M0, then
    L is also recognized by a deterministic
    finite-state automaton M1.
  • Two finite-state automata are called equivalent
    if they recognize the same language.
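One standard way to obtain the deterministic automaton M1 is the subset construction, in which every state of M1 is a set of states of M0. The slides do not spell out a construction, so the sketch below is an illustration under that assumption; the example NFA is hypothetical and the function names are invented here.

```python
def determinize(nfa, start, final, alphabet):
    """Subset construction: each DFA state is a frozenset of NFA states."""
    start_set = frozenset([start])
    dfa, seen, todo = {}, {start_set}, [start_set]
    while todo:
        current = todo.pop()
        for symbol in alphabet:
            nxt = frozenset(t for s in current
                            for t in nfa.get((s, symbol), ()))
            dfa[(current, symbol)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    # A set of NFA states is final if it contains at least one final state.
    dfa_final = {state for state in seen if state & final}
    return dfa, start_set, dfa_final

def dfa_accepts(dfa, start_set, dfa_final, word):
    state = start_set
    for symbol in word:
        state = dfa[(state, symbol)]
    return state in dfa_final

# Hypothetical NFA for {0^n, 0^n 01, 0^n 11 | n = 0, 1, 2, ...}.
NFA = {("s0", "0"): {"s0", "s1"}, ("s0", "1"): {"s2"},
       ("s1", "1"): {"s3"}, ("s2", "1"): {"s3"}}
DFA, D_START, D_FINAL = determinize(NFA, "s0", {"s0", "s3"}, "01")

print(len({state for state, _ in DFA}))        # number of DFA states
print(dfa_accepts(DFA, D_START, D_FINAL, "0011"))
```

The empty set plays the role of a dead state; in the worst case the construction can produce exponentially many subsets.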

42
Outline
  • What is a Formal Language?
  • Phrase-Structure Grammars
  • Finite State Automata
  • Formal languages and Models of Computation

43
Build an FSA from a Regular Grammar
  • Suppose that G = (V, T, S, P) is a regular grammar
    generating the set L(G), where each production is
    of the form
  • S → λ, A → a, or A → aB, with a being a
    terminal symbol and A and B being nonterminal
    symbols.
  • We can build a nondeterministic finite-state
    machine
  • M = (S, I, f, s0, F) that recognizes L(G).

44
  • M = (S, I, f, s0, F)
  • S contains a state sA for each nonterminal
    symbol A of G, and an additional final state sF.
  • The start state s0 is the state formed from the
    start symbol S.
  • A transition from sA to sF on input a is
    included if
  • A → a is a production.
  • A transition from sA to sB on input a is
    included if
  • A → aB is a production.
  • s0 will also be a final state if S → λ is a
    production.
  • It can be shown that L(M) = L(G).

45
Example
  • Construct a nondeterministic finite-state
    automaton that recognizes the language generated
    by the regular grammar G = (V, T, S, P), where
  • V = {0, 1, A, S},
  • T = {0, 1}, and
  • the productions in P are
  • S → 1A, S → 0, S → λ,
  • A → 0A, A → 1A, and
  • A → 1.
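The construction from the previous slides can be carried out mechanically for this grammar. The sketch below is an illustration with invented names (grammar_to_nfa, accepts); it assumes single-character symbols, writes the empty string for λ, and assumes that no nonterminal is itself called F, so that the extra final state sF does not clash with a state built from a nonterminal.

```python
def grammar_to_nfa(productions, start="S"):
    """Build (transitions, start state, final states) from productions
    given as (A, rhs) pairs, where rhs is "", "a" or "aB"."""
    transitions, final = {}, {"sF"}
    for lhs, rhs in productions:
        if rhs == "":                       # S -> lambda: start state is final
            final.add("s" + lhs)
        else:
            # A -> aB goes to sB; A -> a goes to the extra final state sF.
            target = "s" + rhs[1] if len(rhs) == 2 else "sF"
            transitions.setdefault(("s" + lhs, rhs[0]), set()).add(target)
    return transitions, "s" + start, final

def accepts(nfa, word):
    transitions, start, final = nfa
    current = {start}
    for symbol in word:
        current = set().union(*(transitions.get((s, symbol), set())
                                for s in current))
    return bool(current & final)

P = [("S", "1A"), ("S", "0"), ("S", ""),
     ("A", "0A"), ("A", "1A"), ("A", "1")]
M = grammar_to_nfa(P)

for word in ("", "0", "11", "1011", "1", "10", "00"):
    print(repr(word), accepts(M, word))
```

The resulting machine is nondeterministic because sA has two possible next states on input 1: it can stay in sA or move to sF.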

46
Construct a Regular Grammar from an FSA
  • Suppose that M = (S, I, f, s0, F) is a finite-state
    machine with the property that s0 is never the
    next state for a transition.
  • A regular grammar G = (V, T, S, P) can be defined
    as follows:
  • V is formed by assigning a symbol to each state
    of S and each input symbol in I.
  • T is formed from the input symbols in I.
  • S is the symbol formed from the start state s0.
  • The set P of productions is formed from the
    transitions in M:
  • As → a is included if the state s goes to a final
    state under input a, where As is the nonterminal
    symbol formed from s;
  • As → aAt is included if the state s goes to t
    under input a;
  • S → λ is included if and only if λ ∈ L(M).
  • It can be shown that L(G) = L(M).

47
Example
  • Find a regular grammar that generates the
    language recognized by the finite-state automaton
    shown in the following figure
  • Solution: G = (V, T, S, P), where
  • V = {S, A, B, 0, 1}; the symbols S, A, and B
    correspond to the states s0, s1, and s2,
    respectively,
  • T = {0, 1},
  • S is the start symbol, and
  • the productions are
  • S → 0A, S → 1B, S → 1, S → λ,
  • A → 0A, A → 1B, A → 1,
  • B → 0A, B → 1B, B → 1.
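The rules for building the grammar can be applied mechanically. The figure is not reproduced in this transcript, so the transition table below is an assumption, reconstructed here from the productions listed in the solution; the function name fsa_to_grammar and the use of the empty string for λ are likewise choices made for this sketch.

```python
def fsa_to_grammar(transitions, start, final, names):
    """Return the productions (lhs, rhs) of a regular grammar for a
    deterministic automaton; `names` maps each state to its nonterminal."""
    productions = set()
    for (state, symbol), target in transitions.items():
        productions.add((names[state], symbol + names[target]))   # As -> aAt
        if target in final:
            productions.add((names[state], symbol))               # As -> a
    if start in final:            # the empty word is accepted: S -> lambda
        productions.add((names[start], ""))
    return productions

# Assumed automaton, inferred from the solution's productions.
TRANSITIONS = {
    ("s0", "0"): "s1", ("s0", "1"): "s2",
    ("s1", "0"): "s1", ("s1", "1"): "s2",
    ("s2", "0"): "s1", ("s2", "1"): "s2",
}
NAMES = {"s0": "S", "s1": "A", "s2": "B"}
GRAMMAR = fsa_to_grammar(TRANSITIONS, "s0", {"s0", "s2"}, NAMES)

for lhs, rhs in sorted(GRAMMAR):
    print(lhs, "->", rhs if rhs else "lambda")
```

With this table the start state s0 is never the next state of a transition, as the construction requires.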

48
More Powerful Types of Machines (1)
  • The main limitation of finite-state automata is
    their finite amount of memory. This prevents them
    from recognizing languages that are not regular,
    such as {0^n 1^n | n = 0, 1, 2, ...}.
  • A more powerful model of computation, called a
    pushdown automaton, can be used to recognize the
    above language.
  • Theorem: A set is recognized by a pushdown
    automaton if and only if it is the language
    generated by a context-free grammar.
  • However, there are sets that cannot be expressed
    as the language generated by a context-free
    grammar. One such set is
    {0^n 1^n 2^n | n = 0, 1, 2, ...}.
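The extra power of a pushdown automaton comes from its stack, which gives it unbounded memory. The sketch below is not a formal pushdown automaton (the slides do not define one); it is a minimal stack-based recognizer for strings of n zeros followed by n ones, with the flag seen_one standing in for the finite control.

```python
def accepts_0n1n(word):
    """Accept exactly the strings 0^n 1^n (n >= 0) using a stack."""
    stack = []
    seen_one = False
    for symbol in word:
        if symbol == "0":
            if seen_one:            # a 0 after a 1 is out of order
                return False
            stack.append("0")       # push one marker per 0
        elif symbol == "1":
            seen_one = True
            if not stack:           # more 1s than 0s
                return False
            stack.pop()             # match this 1 against a pushed 0
        else:
            return False            # symbol outside the alphabet {0, 1}
    return not stack                # accept only if every 0 was matched

for word in ("", "01", "000111", "001", "011", "10", "0101"):
    print(repr(word), accepts_0n1n(word))
```

A finite-state automaton cannot do this, because counting an arbitrary number of 0s would require an unbounded number of states.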

49
More Powerful Types of Machines (2)
  • Actually, there exist machines even more powerful
    than pushdown automata, called linear bounded
    automata, which
  • can recognize context-sensitive languages such as
    the set
  • {0^n 1^n 2^n | n = 0, 1, 2, ...}, but
  • cannot recognize all the languages generated by
    phrase-structure grammars.
  • The most general model of a computing machine is
    the so-called Turing machine, which can
  • recognize all languages generated by
    phrase-structure grammars, and
  • model all the computations that can be performed
    on a computing machine.

50
Future Scientists vs Engineers
  • Scientists try to understand what is.
  • Engineers try to create what has never been!
  • The really great engineers have a strong
    background in science so that they thoroughly
    understand what is.
  • These special people also have to have the
    imagination to create what has never been, and
    this is what really sets them apart!
  • The methodology of engineering research:
  • There exists some phenomenon of nature for which
    a model should be found.
  • The mathematical analysis is just a tool that
    helps one to find this model.
  • The results of any analysis should be confirmed
    by experiments.
  • Future: what you make it to be!

51
Reference
  • Sections 11.1, 11.3, 11.4 of the following book
  • Kenneth H. Rosen, Discrete Mathematics and Its
    Applications, Fifth Edition, McGraw-Hill
    International Editions, 2004 or
  • The relevant sections of the above book in
    earlier editions.