Title: Using Quantum Fuzzy Logic to learn facial gestures of a Schr
1Using Quantum Fuzzy Logic to learn facial
gestures of a Schrödinger Cat puppet for Robot
Theater
- Arushi Raghuvanshi
- Prof. Marek Perkowski
- 24 May 2008
2Background Quantum Robots
- Old Duck Biped
- Quantum Braitenberg
- Mr PotatoHead (ISMVL 2007)
- Schrödinger's Cat character in Interactive Robot Theatre
[Figure labels: S1, S2, L1, L2, M1 to M6]
3Programming Robot Behaviors
Simple sequential flow with no feedback
Behavior Selection
sound
Theatre Director
Input Initialization
Quantum or other logic controller
Measurement
Effectors
4Programming Robot Behaviors
Adding emotions and environmental feedback
Behavior Selection
sound
Theatre Director
Input Initialization
Quantum or other logic controller
Measurement
Effectors
Theatre Director
emotion
Environment including human audience
5Programming Robot Behaviors
Emotional Interactive Robots with Sensors and
Feedback Modifying the Behavior
Behavior Selection
sound
Theatre Director
Input Initialization
Quantum or other logic controller
Measurement
Effectors
Theatre Director
emotion
sensors
Environment including human audience
6Quantum Fuzzy Logic
- Quantum Circuit: can be transformed into Quantum Fuzzy Logic by replacing gates
- NOT -> Fuzzy NOT
- OR -> MAX
- AND -> MIN
- Fuzzy Logic with MIN and MAX operators
- New Operators and Literals can be defined for Quantum Fuzzy Logic
7Fuzzy Logic Example
[Figure: fuzzy logic circuit example; labeled signal values are 0.3, 0.7, 0, and 1]
8Fuzzy Logic Operations
- Multiple ways to create Fuzzy operations
- Two examples below
- Example 1
- NOT(a) = 1 - a
- e.g. NOT(0.34) = 0.66
- MIN(a, b) = if (a < b) then a else b
- e.g. MIN(0.3, 0.75) = 0.3
- MAX(a, b) = if (a > b) then a else b
- e.g. MAX(0.63, 0.83) = 0.83
- Example 2
- NOT(a) = 1 - a
- e.g. NOT(0.34) = 0.66
- MIN(a, b) = a * b
- e.g. MIN(0.3, 0.7) = 0.21
- MAX(a, b) = (a + b) - ab
- e.g. MAX(0.3, 0.7) = 0.3 + 0.7 - 0.21 = 0.79
- As in Example 2, MAX and MIN may be misnomers. They can be called OR and AND operations instead.
- a MAX b = NOT(NOT(a) MIN NOT(b))
          = NOT((1 - a)(1 - b))
          = NOT(1 - a - b + ab)
          = 1 - 1 + a + b - ab
          = a + b - ab
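The two operator families on this slide can be sketched in a few lines of Python. The function names are illustrative choices, not taken from the slides; the numeric checks reuse the slide's own examples.

```python
def fuzzy_not(a: float) -> float:
    """NOT(a) = 1 - a (shared by Example 1 and Example 2)."""
    return 1.0 - a


# Example 1: comparison-based operators
def min_op(a: float, b: float) -> float:
    return a if a < b else b


def max_op(a: float, b: float) -> float:
    return a if a > b else b


# Example 2: product-based operators (the slide suggests calling these AND / OR)
def and_op(a: float, b: float) -> float:
    return a * b


def or_op(a: float, b: float) -> float:
    return a + b - a * b


def or_via_de_morgan(a: float, b: float) -> float:
    """a OR b = NOT(NOT(a) AND NOT(b)); should agree with or_op."""
    return fuzzy_not(and_op(fuzzy_not(a), fuzzy_not(b)))


if __name__ == "__main__":
    print(min_op(0.3, 0.75), max_op(0.63, 0.83))
    print(and_op(0.3, 0.7), or_op(0.3, 0.7), or_via_de_morgan(0.3, 0.7))
```

The De Morgan form and the closed form `a + b - ab` agree up to floating-point rounding, which is the identity derived on the slide.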
9Representing Fuzzy Values on Bloch Sphere
- Fuzzy values can be represented in different ways on the Bloch Sphere
- Simplest way to represent is along the meridian (as shown on left)
- After measurement, value can be 0, 1 or anywhere in between
- Other mechanisms (e.g. values inside the Bloch Sphere, or parallels of latitude, etc.) can also be used
[Figure: Bloch sphere meridian marked with the values 0, 0.15, 0.5, 0.8, and 1; label "Measurements"]
10Quantum Fuzzy Literals
- Rotation Around X Axis
- Phase Shift (270 degree rotation around Z axis)
- Rotation Around Y Axis
- We use this to define the Fuzzy NOT operations (Other literals can be used as well).
[Figure: Bloch sphere with X, Y, and Z axes]
11Quantum Fuzzy NOT operator
- Inverter is defined in exactly the same way as in quantum logic
- Fuzzy Quantum NOT(α|0⟩ + β|1⟩) = β|0⟩ + α|1⟩
- where the squared magnitude of the (in general complex) amplitude associated with ket |1⟩ is the equivalent of a fuzzy value in the interval [0, 1].
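A minimal sketch of this inverter, assuming a fuzzy value f is encoded with real amplitudes on a meridian so that the squared magnitude of the |1⟩ amplitude equals f. The helper names `encode`, `fuzzy_value`, and `quantum_not` are hypothetical and not from the slides.

```python
import math


def encode(f: float):
    """Return amplitudes (alpha, beta) of alpha|0> + beta|1> with |beta|^2 == f.

    Assumption: real, non-negative amplitudes (a point on one meridian).
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("fuzzy value must lie in [0, 1]")
    return (complex(math.sqrt(1.0 - f)), complex(math.sqrt(f)))


def fuzzy_value(state) -> float:
    """Fuzzy value read from a state: squared magnitude of the |1> amplitude."""
    _alpha, beta = state
    return abs(beta) ** 2


def quantum_not(state):
    """Inverter: swap the amplitudes of |0> and |1>."""
    alpha, beta = state
    return (beta, alpha)


if __name__ == "__main__":
    print(fuzzy_value(quantum_not(encode(0.34))))
```

Swapping the amplitudes turns a fuzzy value f into 1 - f, which matches the classical fuzzy NOT.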
12Quantum Fuzzy MIN operator
Min(α1|0⟩ + α2|1⟩, β1|0⟩ + β2|1⟩) = Davio(α1|0⟩ + α2|1⟩, β1|0⟩ + β2|1⟩, |0⟩)

Inputs (3 parallel lines): α1|0⟩ + α2|1⟩, β1|0⟩ + β2|1⟩, |0⟩
Output line: R (Davio)

Input is the Kronecker product of the 3 parallel input lines:
[α1β1, α1β2, α2β1, α2β2] ⊗ |0⟩ = [α1β1, 0, α1β2, 0, α2β1, 0, α2β2, 0]

Toffoli Gate × Input Matrix = Output Matrix:
[α1β1, 0, α1β2, 0, α2β1, 0, 0, α2β2]

Output state:
α1β1|000⟩ + α1β2|010⟩ + α2β1|100⟩ + α2β2|111⟩
= (α1β1|00⟩ + α1β2|01⟩ + α2β1|10⟩) ⊗ |0⟩ + (α2β2|11⟩) ⊗ |1⟩

|1⟩ => Probability of measurement of 1 is |α2β2|²
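The state-vector calculation on this slide can be checked with a small pure-Python simulation. This is only a sketch: the function names are hypothetical, the basis order is assumed to be |000⟩ through |111⟩ with the ancilla as the last (least significant) qubit, and the Toffoli gate is applied as a swap of the |110⟩ and |111⟩ amplitudes.

```python
def kron(u, v):
    """Kronecker product of two state vectors given as lists."""
    return [x * y for x in u for y in v]


def toffoli(state):
    """Toffoli on a 3-qubit state: flips the target only when both controls are 1.

    In basis order |000>..|111> this swaps the amplitudes at indices 6 and 7.
    """
    out = list(state)
    out[6], out[7] = out[7], out[6]
    return out


def prob_target_one(state):
    """Probability of measuring 1 on the last qubit (odd basis indices)."""
    return sum(abs(amp) ** 2 for i, amp in enumerate(state) if i % 2 == 1)


def quantum_min(a, b):
    """a = [alpha1, alpha2], b = [beta1, beta2]; the ancilla starts in |0>."""
    ancilla = [1.0, 0.0]
    return prob_target_one(toffoli(kron(kron(a, b), ancilla)))


if __name__ == "__main__":
    a = [0.7 ** 0.5, 0.3 ** 0.5]  # fuzzy value 0.3
    b = [0.3 ** 0.5, 0.7 ** 0.5]  # fuzzy value 0.7
    print(quantum_min(a, b))
```

For product-state inputs the result is the product of the two fuzzy values, so this operator behaves like the multiplicative "MIN" (AND) of Example 2 rather than the comparison-based MIN of Example 1.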
13Quantum Fuzzy MAX operator
- The definition of the Fuzzy Quantum Maximum Operator is calculated from the De Morgan rule
- A max B = NOT(NOT(A) min NOT(B)).
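As a worked step, the measurement probability for the maximum operator can be written out under two assumptions that the slides do not state explicitly: the inputs are unentangled (product) states, and the symbols a = |α2|² and b = |β2|² denote the fuzzy values of the two inputs. With the MIN probability |α2β2|² from the previous slide and the inverter swapping amplitudes:

```latex
\begin{aligned}
P_{\min}(a, b) &= |\alpha_2 \beta_2|^2 = ab \\
P_{\max}(a, b) &= 1 - P_{\min}(1 - a,\, 1 - b) \\
               &= 1 - (1 - a)(1 - b) \\
               &= a + b - ab
\end{aligned}
```

This agrees with the MAX operator of Example 2 on the Fuzzy Logic Operations slide.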
14Quantum Fuzzy Logic in Robots
Fuzzy Value Sensors
- Light Sensors: 0 = completely dark, 0.5 = semi-dark, 1 = completely bright
- Sound Sensors: 0 = pin-drop silence, 0.5 = normal noise (people talking), 1 = loud crash
- Image Sensors
Motor Controls causing output behaviors
Quantum Fuzzy Logic
15Back to Robot Theatre.
- Combination of Genetic Algorithm and Quantum
Fuzzy Logic
16Synchronizing Lips with Speech
[Figure: two lip-sync examples, labeled "Want This" and "Not This"]
17Traditional Methods
- Use mapping of phonetic symbol to a lip shape (as shown on left)
- Sound streams can be parsed to generate phonetic symbols
- The methods are language dependent (i.e. different mapping for different languages)
- Need to be modified for speed and style of speaking
18Using Genetic Algorithms
- Sound Input (A)
- GA Engine
- Initial set of genomes representing lip movements (initial population for GA)
- Input to Fitness Function (user evaluation, interactive)
- These are dynamically generated by the program
- B: Sequence representing lip movements matching with input stream A
- The matching function is dynamic, so it doesn't matter if people have different accents, talk slower/faster, etc.
- ESRA Robot: shows lip movements
19Genome
- A Genome (or a chromosome) is a pattern that corresponds to a behavior.
- A possible solution to the given problem can be encoded to create a genome.
- In genetic algorithms, a set of random genomes is created.
- When decoded, these genomes represent possible solutions to the given problem.
- In my experiment, a genome is an encoded string that represents a sequence of lip movements. For example: 49__9__31__9__46_1640__
- When decoded, this code represents the lip motion for the phrase "Hi I am a robot."
20Encoding Lip Shapes for Defining the Genome
Code    Upper   Lower
0, 1    127     127
2       87      173
3       170     120
4       140     56
5       0       0
6       0       167
7, 8    80      45
9       100     45
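The encoding table can be turned into a small decoder that maps a genome string to a sequence of (upper, lower) values. This is a sketch, not the author's program: the names `LIP_CODES` and `decode_genome` are hypothetical, and representing a pause `_` as `None` is an assumption, since the slides do not say what the motors do during a pause.

```python
# (upper, lower) values per code, copied from the table on this slide
LIP_CODES = {
    "0": (127, 127), "1": (127, 127),
    "2": (87, 173),
    "3": (170, 120),
    "4": (140, 56),
    "5": (0, 0),
    "6": (0, 167),
    "7": (80, 45), "8": (80, 45),
    "9": (100, 45),
}
PAUSE = "_"


def decode_genome(genome: str):
    """Return one (upper, lower) tuple per symbol, or None for a pause."""
    frames = []
    for symbol in genome:
        if symbol == PAUSE:
            frames.append(None)  # assumption: a pause carries no lip shape
        elif symbol in LIP_CODES:
            frames.append(LIP_CODES[symbol])
        else:
            raise ValueError(f"unknown lip code: {symbol!r}")
    return frames


if __name__ == "__main__":
    for frame in decode_genome("49__9__31__9__46_1640__"):
        print(frame)
```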
21Fitness Function
- The better the robot completes the problem, the higher the fitness score.
- When synchronizing sound and lip motion, the fitness function would be a user input.
- To test the Genetic Algorithm, I calculated the fitness function by comparing the genomes to the best solution.
- The best solution was determined by the traditional method.
22Fitness Function Algorithm
Best Genome (for calculating Fitness Score):  1 4 9 5 7 _ 3 8
Genome Under Test:                            5 3 _ 8 3 _ 3 8
- Find the difference X for each corresponding element
- Closeness implies a better match (4-3 is better than 1-5)
- Pauses _ must match in position to get any score, so the score for a pause position is either 0 or 9
X:    4 1 9 3 4 0 0 0
9-X:  5 8 0 6 5 9 9 9   (higher number is better now!)
Total Score = 5 + 8 + 0 + 6 + 5 + 9 + 9 + 9 = 51
Fitness Score = (Total / TotalPossible) × 100 = (51 / 72) × 100 = 70.83
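The scoring rule on this slide can be sketched as follows. The function names are hypothetical; the rule itself (9 minus the digit difference, and all-or-nothing scoring for pauses) follows the slide.

```python
PAUSE = "_"
MAX_POINTS = 9  # best possible score per position


def position_score(best: str, test: str) -> int:
    """Score one position: pauses must match exactly, digits score 9 - |difference|."""
    if best == PAUSE or test == PAUSE:
        return MAX_POINTS if best == test else 0
    return MAX_POINTS - abs(int(best) - int(test))


def fitness(best_genome: str, test_genome: str) -> float:
    """Fitness as a percentage of the maximum possible total."""
    if not best_genome or len(best_genome) != len(test_genome):
        raise ValueError("genomes must be non-empty and of equal length")
    total = sum(position_score(b, t) for b, t in zip(best_genome, test_genome))
    return 100.0 * total / (MAX_POINTS * len(best_genome))


if __name__ == "__main__":
    print(round(fitness("14957_38", "53_83_38"), 2))
```

Running it on the slide's two genomes reproduces the total of 51 out of 72.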
23Selection
- The higher the fitness score, the higher the probability of being selected.
- Selection methods include the Roulette Wheel, Tournament Selector, and Truncation Selection
- In my experiment, I used a Roulette Wheel for selection.
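Roulette wheel selection picks an individual with probability proportional to its fitness score. The sketch below is a generic illustration with a hypothetical function name; it is not GALib's selector and makes no claim about how the author's program implements it.

```python
import random


def roulette_select(population, scores, rng=random):
    """Pick one individual with probability proportional to its score."""
    total = sum(scores)
    if total <= 0:
        # Assumption: fall back to a uniform pick when all scores are zero
        return rng.choice(population)
    pick = rng.random() * total  # a point on the wheel, in [0, total)
    running = 0.0
    for individual, score in zip(population, scores):
        running += score
        if pick < running:
            return individual
    return population[-1]  # guard against floating-point round-off


if __name__ == "__main__":
    rng = random.Random(0)
    picks = [roulette_select(["low", "high"], [1.0, 9.0], rng) for _ in range(1000)]
    print(picks.count("high"), picks.count("low"))
```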
24Crossover
- When two chromosomes from the group are selected, they are combined to create a new genome.
- Depending on the crossover rate, the bits from each chosen genome are crossed at a randomly chosen point.
- The higher the crossover rate is, the more likely it is that a crossover will occur.
- The crossover occurs at a randomly chosen point in the genome.
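A single-point crossover on string genomes can be sketched as below. The function name is hypothetical, and the default rate of 0.9 mirrors the 90 percent crossover rate reported later in the deck; everything else is a generic illustration rather than the author's code.

```python
import random


def single_point_crossover(parent_a: str, parent_b: str, rate: float = 0.9, rng=random):
    """Return two children; with probability `rate` the tails are exchanged."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if len(parent_a) < 2 or rng.random() >= rate:
        return parent_a, parent_b  # no crossover this time
    point = rng.randint(1, len(parent_a) - 1)  # cut strictly inside the genome
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


if __name__ == "__main__":
    print(single_point_crossover("11111111", "22222222", rate=1.0, rng=random.Random(1)))
```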
25Mutation
- Depending on the mutation rate, chosen bits of the genome are changed.
- The higher the mutation rate, the more likely it is that a bit will be changed.
- Shown to the right are many types of mutation
26Mutation
- In my experiment I used two different mutation functions:
- Swap mutation
- myMutator
- I created my own mutator, which changes a single bit rather than swapping two bits.
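The two mutation styles can be contrasted in a short sketch. Both functions are illustrative guesses: the slides do not show the source of myMutator, so `single_position_mutation` (which overwrites a position with a random symbol) is only one plausible reading of "changes a single bit", and the alphabet of digits plus `_` is assumed from the genome example.

```python
import random

ALPHABET = "0123456789_"  # assumption: lip codes 0-9 plus the pause symbol


def swap_mutation(genome: str, rate: float, rng=random) -> str:
    """For each position, with probability `rate`, swap it with a random position."""
    symbols = list(genome)
    for i in range(len(symbols)):
        if rng.random() < rate:
            j = rng.randrange(len(symbols))
            symbols[i], symbols[j] = symbols[j], symbols[i]
    return "".join(symbols)


def single_position_mutation(genome: str, rate: float, rng=random) -> str:
    """For each position, with probability `rate`, replace it with a random symbol."""
    symbols = list(genome)
    for i in range(len(symbols)):
        if rng.random() < rate:
            symbols[i] = rng.choice(ALPHABET)
    return "".join(symbols)


if __name__ == "__main__":
    rng = random.Random(0)
    print(swap_mutation("14957_38", 0.5, rng))
    print(single_position_mutation("14957_38", 0.5, rng))
```

A swap only reorders symbols that are already in the genome, while a single-position change can introduce a lip code that was absent from it.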
27Terminating Conditions
- This generational process is repeated until a termination condition has been reached. Common terminating conditions are:
- A solution is found that satisfies minimum criteria
- Fixed number of generations reached
- Allocated budget (computation time/money) reached
- The highest ranking solution's fitness is reaching or has reached a plateau such that successive iterations no longer produce better results
- Manual inspection
- Combinations of the above
- I used a fixed number of generations as the ending criterion. Default: 4,000 generations. I also experimented with changing the number of generations.
28Basic Genetic Algorithm Flow
1. Initialize population
2. Select individuals for mating based on Fitness Function
3. Mate individuals to produce offspring
4. Mutate offspring
5. Insert offspring into population
6. Are stopping criteria satisfied? If not, return to step 2
7. Finish
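The flow above can be put together in one compact loop. This is a sketch under several assumptions, not the author's GALib program: the population is fully replaced each generation, fitness compares against a known target genome (as in the automated test mode), selection uses `random.choices` with fitness weights as a roulette wheel, and all names and default parameter values other than the 90 percent crossover rate are illustrative.

```python
import random

ALPHABET = "0123456789_"  # assumption: lip codes plus the pause symbol


def fitness(genome: str, target: str) -> int:
    """Sum of per-position scores: 9 - |difference|, pauses all-or-nothing."""
    score = 0
    for g, t in zip(genome, target):
        if g == "_" or t == "_":
            score += 9 if g == t else 0
        else:
            score += 9 - abs(int(g) - int(t))
    return score


def evolve(target, pop_size=30, generations=200,
           crossover_rate=0.9, mutation_rate=0.01, seed=0):
    rng = random.Random(seed)
    n = len(target)
    # 1. initialize population with random genomes
    population = ["".join(rng.choice(ALPHABET) for _ in range(n))
                  for _ in range(pop_size)]
    for _ in range(generations):  # 6. stop after a fixed number of generations
        # small offset keeps the weights valid even if every score is zero
        weights = [fitness(g, target) + 1e-9 for g in population]
        offspring = []
        while len(offspring) < pop_size:
            # 2. roulette wheel selection of two parents
            mom, dad = rng.choices(population, weights=weights, k=2)
            # 3. single-point crossover
            if n > 1 and rng.random() < crossover_rate:
                cut = rng.randint(1, n - 1)
                mom, dad = mom[:cut] + dad[cut:], dad[:cut] + mom[cut:]
            # 4. per-position mutation
            for child in (mom, dad):
                offspring.append("".join(
                    rng.choice(ALPHABET) if rng.random() < mutation_rate else s
                    for s in child))
        # 5. offspring replace the old population
        population = offspring[:pop_size]
    # 7. finish: return the best genome of the final population
    return max(population, key=lambda g: fitness(g, target))


if __name__ == "__main__":
    best = evolve("14957_38")
    print(best, fitness(best, "14957_38"))
```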
29GA for Lip Synchronization
- Automated Mode
- Test Sound Input
- Matching Sequence for Automating Fitness Function Evaluation
- Interactive Mode
- A: length
- GA Engine
- Initial set of genomes representing lip movements (initial population for GA)
- Interactive Input to Fitness Function
- These are dynamically generated by the program
- B: Sequence representing lip movements matching with input stream A
- Original sound input
- In a real application, input to the Fitness Function is dynamic and language independent, and it doesn't matter if people have different accents, talk slower/faster, etc.
- ESRA Robot: shows lip movements
30Genetic Algorithm Behaviors
34GA Results thus far..
- Created a self-learning robot that can learn how to synchronize sounds and words with appropriate facial expressions.
- Finding the best solution depends on different conditions. In general, I noticed that the functions that gave the higher objective scores tended to take more time to complete 4,000 generations.
35Ongoing work
- Combining Quantum Fuzzy Logic with Robotic Theatre.
- Modify the body language (hand and arm movements) based on environmental sensors
- Sound sensors (fuzzy value input) to detect noisy or quiet environments and modify behavior
- Light sensor values (fuzzy value input) to detect day and night and modify behavior
- Quantum Fuzzy Schrödinger Cat sitting on Quantum Fuzzy Braitenberg vehicle arguing with Einstein, singing a song and going crazy.
36Cat Singing
- A lively little quantum went darting through the
air, Just as happy quanta go speeding everywhere
..
37Thank You
38Genetic Algorithms
- A genetic algorithm is a search technique used in
computing to find exact or approximate solutions
to optimization and search problems. Genetic
algorithms are a particular class of evolutionary
algorithms that use techniques such as
inheritance, mutation, selection, and crossover.
39Traditional Method(Without Genetic Algorithms)
- Phonetic Letters, Punctuation, and Syllables
- Audio
- Speech Recognition
- Language Dependent
- Matches input to correct lip motion (static)
- Sequence representing lip movements matching with audio input string
- Since the matching function is static, it will have to be entirely recoded for different people because they have different accents, talk slower/faster, etc.
- ESRA Robot: shows lip movements
40ESRA Robot Facial Expressions
- ESRA Robot has several motors for lips, eyelids and arm movements
- I am primarily using lip motors for my experiment
- The specific positions of the lip motors define the shape of the lip
- The shape can be matched with speech
[Figure labels: Motor for Eye Lids, Motor for Upper Lip, Motor for Lower Lip]
41Crossover
- Single Point Crossover
- Double Point Crossover gives any two points on each genome an equal chance of being split up.
- In my experiment, I used a single point crossover with a 90 percent crossover rate.
42Procedure
1. Create a robot with a face, a mouth, and two motors for lip movement.
2. Assign shapes of the mouth for every sound/syllable.
3. Encode these shapes using numbers and characters.
4. Create a random set of genomes for a given input.
5. Depending on the number of encodings that match with the appropriate sound, a fitness score will be assigned to each genome.
6. Using a Roulette Wheel, genomes will be selected for reproduction. The higher the fitness score, the higher the probability of being selected for reproduction.
7. To create a new set of offspring, one random crossover point will be chosen for each pair of genomes.
8. There will also be a 1% mutation rate.
9. A new set of genomes (the offspring) is created.
10. Repeat steps 5-9 for a fixed number of generations.
11. Change the Genetic Algorithm parameters and record the dependent variables.
43Program
- I used GALib from MIT lab as a library in my program.
- I designed my own genome
- Defined my fitness function
- Created an initializer function
- Created a mutator function
- Program link: Project file
- EsraGA: Main C source code
44Data
- Data Tables with swap mutator
- Data Tables with my mutator
45Abstract
- The purpose of this project is to create
efficient Genetic Algorithms for robotic learning
and the synchronization of speech and visual
expressions. This experiment uses an ESRA robot
which has a set of motors to control facial
expressions including lip motion and eyebrow
motion. Emotions can be created using facial
expressions and arm motion; however, for the
simplicity of this experiment, the focus is on
lip motion. Various shapes of the mouth are
assigned to the appropriate sounds and encoded.
Using these encodings I create a random set of
chromosomes. I then use Genetic Algorithms so the
robot can develop the lip motion to correspond
with spoken text. Next, I use the Genetic
Algorithm to test how long it takes to
synchronize text and lip motion for varying
length, crossover rate, mutation rate, number of
generations, population size, and number of
offspring. Overall, I concluded that my
hypothesis was supported because using genetic
algorithms for behavioral evolution, I was able
to create a robot that can learn how to
synchronize sounds and words with appropriate
facial expressions. After testing various
parameters, I concluded that functions that
return higher objective scores take a longer
time to complete. Some applications of this
project include translating text into lip motion
for animation movies and humanoid robots. The
next step in this project would be to try
different parameters such as convergence and
migrating populations. I could also develop body
language as well as lip motion.
46Applications
- With a program using genetic algorithms, matching lip movements to speech is language independent. Also, one can use the same program for different people. In the traditional style, the tables would have to be recoded because everyone has an individual accent, body language, and speaking speed.
- This program can be used to match text and lip motion for movie animation and humanoid robots.
- Animation industries don't have to hand draw lip motion or use a databank of words. This would be most effective if I used a combination of pre-programmed lipcodes and user inputs.
- This could be used to convert sounds into lip motion so deaf people can understand what is being said in situations in which they can't see the person who is speaking.
- It could also be used in reverse and convert lip motion into text. This could be useful in documenting presentations, speeches, and even court cases. It could also be used to create subtitles in movies.
47Representing Fuzzy Values on Bloch Sphere
- Show L1 through L5 options