CSA305: Natural Language Algorithms

Slides: 21
Provided by: michael307
1
CSA305 Natural Language Algorithms
  • Deterministic and Non Deterministic Recognition

2
Acknowledgement
  • Material presented is adapted from Jurafsky and
    Martin, Ch. 2

3
Representation of Automata using Transition Tables
4
Transition Table Representation in Prolog
  • % Columns: S  b  a  !   (0 in a symbol column means no transition)
  • s(0,1,0,0).
  • s(1,0,2,0).
  • s(2,0,3,0).
  • s(3,0,3,4).
  • s(4,0,0,0).
  • next(OldState,b,NewState) :-
  •     s(OldState,NewState,_,_), NewState \== 0.
  • next(OldState,a,NewState) :-
  •     s(OldState,_,NewState,_), NewState \== 0.
  • next(OldState,!,NewState) :-
  •     s(OldState,_,_,NewState), NewState \== 0.

5
A Better Representation
  • s(0,b,1).
  • s(1,a,2).
  • s(2,a,3).
  • s(3,a,3).
  • s(3,!,4).
  • next(OldState,Sym,NewState) :-
  •     s(OldState,Sym,NewState).

6
The Process of Recognition 1
  • Start in the initial state and at the first
    symbol of the word.
  • If there is an arc labelled with that symbol, the
    machine transitions to the next state, and the
    symbol is consumed.
  • The process continues with successive symbols
    until ....

7
The Process of Recognition 2
  • ... until one of these conditions holds:
  • A. All symbols in the input are consumed.
  • If the current state is final, succeed; otherwise fail.
  • B. There is no transition out of the current state for
    the current symbol.
  • Fail.

8
Deterministic Recognition
  • A deterministic algorithm is one that has no
    choice points
  • The following algorithm takes as input a tape and
    an automaton.
  • It returns accept if the tape is recognised, else reject.

9
DETERMINISTIC FSA RECOGNITION
10
Skeleton of Prolog Implementation
  • % drec(+Tape, +Machine, +State, -Result)
  • % The cuts stop the last clause from also answering no on backtracking.
  • drec([], _M, S, yes) :-
  •     final(S), !.
  • drec([H|T], M, S, Result) :-
  •     tran(M,S,H,N), !,
  •     drec(T,M,N,Result).
  • drec(_,_,_,no).
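
To make the skeleton concrete, here is a minimal self-contained sketch for the sheep-language machine from the "A Better Representation" slide. The machine name `sheep` and the predicate names `arc/4`, `accepting/2` and `drec_demo/4` are assumptions chosen for this example; they do not come from the original slides.

```prolog
% Hypothetical machine 'sheep': accepts b a a+ !  (baa!, baaa!, ...)
% arc(Machine, FromState, Symbol, ToState)
arc(sheep, 0, b, 1).
arc(sheep, 1, a, 2).
arc(sheep, 2, a, 3).
arc(sheep, 3, a, 3).
arc(sheep, 3, !, 4).

% accepting(Machine, State): State is a final state of Machine.
accepting(sheep, 4).

% drec_demo(+Tape, +Machine, +State, -Result)
% Result is yes if Machine, started in State, consumes all of Tape
% and stops in a final state; otherwise Result is no.
drec_demo([], M, S, Result) :- !,
    (   accepting(M, S) -> Result = yes ; Result = no ).
drec_demo([H|T], M, S, Result) :-
    arc(M, S, H, N), !,            % at most one arc applies: no choice point
    drec_demo(T, M, N, Result).
drec_demo(_, _, _, no).            % no arc for the current symbol
```

With these facts, `?- drec_demo([b,a,a,a,!], sheep, 0, R).` gives `R = yes`, while `?- drec_demo([b,a,!], sheep, 0, R).` gives `R = no`, because state 2 has no arc labelled `!`.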

11
Failure States
  • We can regard failure as a special state.
  • That state is reached by adding supplementary
    arcs that represent invalid input.
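
One way to picture the failure state is as a sink that every missing arc leads to, so that the transition function becomes total. The sketch below assumes the sheep-language arcs from the earlier slides; the names `symbol/1`, `farc/3`, `total_next/3`, `run/3` and the state name `fail` are hypothetical.

```prolog
% Assumed alphabet of the machine.
symbol(a).
symbol(b).
symbol(!).

% Explicit arcs of the sheep-language machine.
farc(0, b, 1).
farc(1, a, 2).
farc(2, a, 3).
farc(3, a, 3).
farc(3, !, 4).

% total_next(+State, +Sym, -Next): a total transition function.
% Any (State, Sym) pair without an explicit arc leads to the sink
% state 'fail'; 'fail' has no arcs, so it loops back to itself.
total_next(S, Sym, N) :-
    symbol(Sym),
    (   farc(S, Sym, N0) -> N = N0 ; N = fail ).

% run(+Tape, +State, -Last): Last is the state reached after
% consuming the whole Tape starting from State.
run([], S, S).
run([H|T], S, Last) :-
    total_next(S, H, N),
    run(T, N, Last).
```

For example, `?- run([b,a,a,!], 0, L).` yields `L = 4`, whereas `?- run([b,b,a], 0, L).` yields `L = fail`: once the machine enters the failure state it never leaves it.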

12
Adding a Failure State
13
Deterministic versus Non Deterministic
Recognition.
  • The behaviour of the automata we have considered
    is fully determined by the current state, and the
    input symbol.
  • The recognition process is said to be
    deterministic.
  • This is not necessarily the case. An automaton may have:
  • Several arcs with the same label leaving one state.
  • ε-transitions: arcs with no label.
  • Automata like this are called non-deterministic.

14
Non Deterministic FAs
15
Non Deterministic Recognition
  • There are three ways of dealing with
    non-deterministic recognition:
  • Backtracking: at every choice point, record the
    state and the as yet unexplored choices.
  • Lookahead: peek ahead n symbols in the input in
    order to decide which path to take.
  • Parallel search: look at every path in parallel.
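
Of these three strategies, backtracking is the one Prolog's own execution model supplies. The sketch below assumes a hypothetical non-deterministic variant of the sheep machine in which state 2 has two arcs labelled `a`; the predicate names `narc/3`, `nfinal/1` and `nrec/2` are illustrative, and ε-arcs are not handled.

```prolog
% Hypothetical non-deterministic machine for b a a+ !
% State 2 has two arcs on a, so the recogniser must choose.
narc(0, b, 1).
narc(1, a, 2).
narc(2, a, 2).
narc(2, a, 3).
narc(3, !, 4).

nfinal(4).

% nrec(+Tape, +State): succeeds if some path from State consumes
% all of Tape and ends in a final state.
nrec([], S) :-
    nfinal(S).
nrec([H|T], S) :-
    narc(S, H, N),     % choice point: on failure Prolog retries other arcs
    nrec(T, N).
```

The query `?- nrec([b,a,a,a,!], 0).` succeeds: the first attempt stays in state 2 too long and gets stuck on `!`, after which Prolog backtracks to the arc into state 3. The query `?- nrec([b,a,!], 0).` fails because no path consumes the whole tape.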

16
ND-RECOGNISE
function ND-RECOGNISE(tape, machine) returns accept or reject
  agenda ← {(q0(machine), 0)}
  search_state ← NEXT(agenda)
  loop
    if ACCEPT-STATE?(search_state) returns true then
      return accept
    else
      agenda ← agenda ∪ GENERATE-NEW-STATES(search_state)
    if agenda is empty then
      return reject
    else
      search_state ← NEXT(agenda)
  end
17
ACCEPT-STATE?
function ACCEPT-STATE?(search_state)
  mstate ← first(search_state)
  tape_pos ← second(search_state)
  if tape[tape_pos] = end_input and IS-FINAL?(mstate) then
    return true
  else
    return false
18
GENERATE-NEW-STATES
function GENERATE-NEW-STATES(search_state)
  mstate ← first(search_state)
  tape_pos ← second(search_state)
  return {(x, tape_pos) | x ∈ trantable[mstate, ε]}
       ∪ {(x, tape_pos + 1) | x ∈ trantable[mstate, tape[tape_pos]]}
19
Recognition as Search
  • Recognition can be regarded as a search problem
  • Initial state, Goal State
  • Rules
  • Strategy
  • Different search behaviours (depth first, breadth
    first) can be evoked by managing the agenda in
    different ways.
  • See Jurafsky and Martin, Sect. 2.2
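
The effect of agenda management can be sketched in Prolog as follows. This is an illustration under stated assumptions, not the slides' own implementation: a search state is written `MState-Rest` (machine state plus unread tape) instead of a tape index, ε-arcs carry the made-up label `eps`, and all predicate names are hypothetical. An automaton with a cycle of ε-arcs could make this loop forever.

```prolog
:- use_module(library(lists)).

% Hypothetical machine for b a a+ ! with an epsilon arc from 3 back to 2.
garc(0, b, 1).
garc(1, a, 2).
garc(2, a, 3).
garc(3, eps, 2).
garc(3, !, 4).

gfinal(4).

% accept_state(+SearchState): tape exhausted in a final machine state.
accept_state(S-[]) :-
    gfinal(S).

% new_states(+SearchState, -News): successors via eps arcs (tape
% untouched) and via arcs labelled with the next tape symbol.
new_states(S-Rest, News) :-
    findall(X-Rest, garc(S, eps, X), EpsMoves),
    (   Rest = [H|T]
    ->  findall(X-T, garc(S, H, X), SymMoves)
    ;   SymMoves = []
    ),
    append(EpsMoves, SymMoves, News).

% nd_recognise(+Tape, +Strategy, -Result): Strategy is depth or breadth.
nd_recognise(Tape, Strategy, Result) :-
    search([0-Tape], Strategy, Result).

search([], _, reject).
search([Cur|Rest], Strategy, Result) :-
    (   accept_state(Cur)
    ->  Result = accept
    ;   new_states(Cur, News),
        (   Strategy == depth
        ->  append(News, Rest, Agenda)   % stack: newest states first
        ;   append(Rest, News, Agenda)   % queue: oldest states first
        ),
        search(Agenda, Strategy, Result)
    ).
```

Both `?- nd_recognise([b,a,a,a,!], depth, R).` and the same query with `breadth` give `R = accept`; only the order in which search states are visited differs. `?- nd_recognise([b,a,!], depth, R).` gives `R = reject` once the agenda runs empty.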

20
Deterministic and Non Deterministic FSAs
  • The class of languages recognisable by NDFSA is
    identical to that recognised by DFSA.
  • For every NDFSA ND there is an equivalent DFSA D.
  • The states of D correspond to sets of states in
    ND.
  • If N is the number of states in ND, the number of
    states in D is at most 2^N.
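
The correspondence between states of D and sets of states of ND can be sketched by computing D's transitions on demand. The example assumes a hypothetical NDFSA without ε-arcs (so no ε-closure is needed); the names `ndarc/3`, `move/3` and `reach/3` are illustrative only.

```prolog
:- use_module(library(lists)).

% Hypothetical NDFSA for b a a+ ! : state 2 has two arcs labelled a.
ndarc(0, b, 1).
ndarc(1, a, 2).
ndarc(2, a, 2).
ndarc(2, a, 3).
ndarc(3, !, 4).

% move(+StateSet, +Sym, -NextSet): one transition of the equivalent DFSA.
% StateSet and NextSet are sorted, duplicate-free lists of ND states.
move(StateSet, Sym, NextSet) :-
    findall(N, ( member(S, StateSet), ndarc(S, Sym, N) ), Ns),
    sort(Ns, NextSet).               % sort/2 also removes duplicates

% reach(+Tape, +StateSet, -Final): the DFSA state (a set of ND states)
% reached after consuming Tape from StateSet.
reach([], Set, Set).
reach([H|T], Set, Final) :-
    move(Set, H, Next),
    reach(T, Next, Final).
```

Here `?- reach([b,a,a], [0], F).` gives `F = [2,3]`, a single state of D that stands for two states of ND, and `?- reach([b,b], [0], F).` gives `F = []`, which plays the role of a failure state. With 5 states in ND there are 2^5 = 32 possible subsets, but only a few of them are reachable in this machine.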