Title: Getting Cyc-ed about Inference
1Getting Cyc-ed about Inference
2What is Cyc?
- The World's Leading Provider of Formalized
Common Sense
- (currently 200,000 terms, each with several
assertions; over 1,000,000 rules)
3What is Cyc?
- Founded in 1984 by Stanford professor Doug Lenat,
it was a project at MCC (Microelectronics and
Computer Technology Corporation) until 1994, when
Lenat left to form Cycorp
- Objective: Codify the millions of pieces of
knowledge that comprise common sense
- When people die, they stop buying things
- Kerosene flows downhill
- When a bowl is overturned, its contents fall out.
4Common Sense
- Cyc's stated goal
- Break the software brittleness bottleneck
once and for all by constructing a foundation of
basic common-sense knowledge: a semantic
substratum of terms, rules and relations, a deep
layer of understanding that can be used by other
programs to make them more flexible.
- Basic Common-Sense Knowledge
- "In modern America, this encompasses recent
history and current affairs, everyday physics,
household chemistry, famous books and movies and
songs and ads, famous people, nutrition,
addition, weather, etc."
5Overview
- What is Cyc
- OpenCyc, ResearchCyc, Full Cyc
- What's in Cyc?
- The Big Picture
- Microtheories
- Predicates and Functions
- Arguments and Types
- Lexicon
- How do I use it?
- Cyc at Stanford
- Cyc Browser
- Java and Applications
Several examples and images come from the more
extensive online OpenCyc Tutorial: www.cyc.com/doc/tut
6What's in Cyc?
- A Knowledge Base (KB) consisting of terms
- Dog, DogFood, Doghouse, SnoopDoggyDogg
- Assertions that relate these terms.
- Ground Assertions
- (isa MyDogSharkey BelgianSheepdog)
- (genls BelgianSheepdog Dog)
- Rules, which derive assertions from Ground
Assertions
- (isa THING COL)
- (genls COL SUPERCOL) --->
- (isa THING SUPERCOL)
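The rule above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Cyc's inference engine works: assertions are plain tuples, `derive_isa` is a hypothetical helper, and the `(genls Dog Animal)` link is added here so the rule fires more than once.

```python
# Ground assertions as (predicate, arg1, arg2) tuples.
GROUND = {
    ("isa", "MyDogSharkey", "BelgianSheepdog"),
    ("genls", "BelgianSheepdog", "Dog"),
    ("genls", "Dog", "Animal"),  # extra link, added for illustration
}

def derive_isa(assertions):
    """Apply (isa THING COL) + (genls COL SUPERCOL) ---> (isa THING SUPERCOL)
    repeatedly until no new isa assertion can be derived."""
    known = set(assertions)
    changed = True
    while changed:
        changed = False
        isa_facts = [(a, b) for p, a, b in known if p == "isa"]
        genls_facts = [(a, b) for p, a, b in known if p == "genls"]
        for thing, col in isa_facts:
            for sub, sup in genls_facts:
                if sub == col and ("isa", thing, sup) not in known:
                    known.add(("isa", thing, sup))
                    changed = True
    return known

# Only the newly derived assertions.
derived = derive_isa(GROUND) - GROUND
print(sorted(derived))
```

Running it derives that MyDogSharkey is both a Dog and an Animal, neither of which was asserted directly.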
7The Knowledge Base
Knowledge Base Layers
- Upper Ontology: Abstract Concepts
- EVENT ⊆ TEMPORAL-THING ⊆ INDIVIDUAL ⊆ THING
- Core Theories: Space, Time, Causality, ...
- For all events a and b, a causes b implies a
precedes b
- Domain-Specific Theories
- For any mammal m and any anthrax bacteria a, m's
being exposed to a causes m to be infected by a.
- Facts: Instances
- John is a person infected by anthrax.
8A Dog is a ..
- Agent, Agent-Generic, AirBreathingVertebrate,
Animal, AnimalBLO, BilateralObject,
BiologicalLivingObject, CanineAnimal, Carnivore,
CarnivoreOrder, ChordataPhylum, Coelmates,
Container-Underspecified, Dog, EukaryoticOrganism,
Eutheria, FrontAndBackSidedObject, Heterotroph,
HexalateralObject, Homeotherm, HumanScaleObject,
Individual, IndividualAgent,
LeftAndRightSidedObject, Location-Underspecified,
Mammal, NaturalTangibleStuff, NonPersonAnimal,
OrganicStuff, Organism-Whole, PartiallyTangible,
PerceptualAgent, Region-Underspecified,
SentientAnimal, SolidTangibleThing,
SomethingExisting, SpatialThing,
SpatialThing-Localized, System-Generic,
TemporalThing, TerrestrialOrganism, Thing,
TopAndBottomSidedObject, Trajector-Underspecified,
Vertebrate
9Microtheories
- A way of grouping assertions and rules which
share a set of assumptions about a domain, level
of detail, period in time, source, topic, etc.
- Each KB assertion occurs within some microtheory
- These allow for a KB that copes with global
inconsistency and that can focus inference
according to necessary detail
10Microtheories
- Though no monotonic contradictions are allowed
inside a microtheory, assertions in different
microtheories may be inconsistent
- Time
- MT1 Mandela is an elder statesman
- MT2 Mandela is the President of South Africa
- MT3 Mandela is a political prisoner
- Granularity/domain
- MT1 Tables are solid
- MT2 Tables are mostly space
- Microtheories are arranged in an inheritance
hierarchy
11Microtheory Inheritance: genlMt
[Figure: microtheory inheritance hierarchy. The
nodes BaseKB, NaiveSpatialMt, MovementMt,
NaivePhysicsMt, NaturalGeographyMt and
TransportationMt are linked by genlMt arrows.]
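One way to picture genlMt inheritance is that a query asked in one microtheory sees the assertions of that microtheory plus everything it inherits from. The sketch below assumes particular genlMt links (the arrows in the slide's figure are not recoverable here), a hypothetical ParticlePhysicsMt, and hypothetical helpers `visible_mts` and `visible_assertions`; it is not Cyc's implementation.

```python
# Assumed genlMt links: each microtheory maps to the more general
# microtheories it inherits from. These edges are an assumption.
GENL_MT = {
    "BaseKB": [],
    "NaiveSpatialMt": ["BaseKB"],
    "MovementMt": ["BaseKB"],
    "NaivePhysicsMt": ["NaiveSpatialMt", "MovementMt"],
    "NaturalGeographyMt": ["NaiveSpatialMt"],
    "TransportationMt": ["NaivePhysicsMt", "NaturalGeographyMt"],
    "ParticlePhysicsMt": ["BaseKB"],  # hypothetical microtheory
}

# Each assertion lives in exactly one microtheory (placement is illustrative).
ASSERTIONS = {
    "NaivePhysicsMt": ["Tables are solid"],
    "ParticlePhysicsMt": ["Tables are mostly space"],
}

def visible_mts(mt):
    """Return mt together with every microtheory reachable via genlMt."""
    seen = set()
    stack = [mt]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(GENL_MT.get(current, []))
    return seen

def visible_assertions(mt):
    """Collect the assertions visible from mt, in a stable order."""
    return [a for m in sorted(visible_mts(mt)) for a in ASSERTIONS.get(m, [])]

print(visible_assertions("TransportationMt"))
```

From TransportationMt only "Tables are solid" is visible, so the two conflicting views of tables never meet in one inference context.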
12Predicates and Denotational Functions
- Predicates are truth-functional relations which
can be evaluated according to facts in the KB and
used to make sentences that are true or false
- Usually lowercase
- (objectHasColor BrownDog Brown)
- (memberStatusOfOrganization Norway NATO
FoundingMember)
- Functions take arguments to denote Non-Atomic
Terms (NATs), expressions that represent things
- Usually uppercase
- (FruitFn AppleTree) denotes an apple
- (BorderBetweenFn Sweden Norway) denotes the
border between Sweden and Norway.
13Arity and Argument Types
- Every predicate or function is defined with
particular arity and argument types
- Arity: Number of Arguments
- (arity mother 2) (arity MotherFn 1)
- Argument Types use isa and genl relations
- (arg1Isa mother Animal)
- (arg2Isa mother FemaleAnimal)
- (arg1Isa TransportViaFn ExistingObjectType)
- (arg1Genl treatmentTypeAppliedToConditionType
MedicalTreatmentEvent)
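Arity and argument-type constraints let the KB reject ill-formed sentences before they are asserted. The sketch below checks a candidate sentence against the `arity` and `argNIsa` constraints for `mother` shown above. The individuals Sharkey and Lassie, the FemaleDog collection, and the helpers `collections_of` and `check` are all hypothetical; Cyc's own well-formedness checking is more involved.

```python
# Constraints taken from the slide.
ARITY = {"mother": 2, "MotherFn": 1}
ARG_ISA = {("mother", 1): "Animal", ("mother", 2): "FemaleAnimal"}

# Hypothetical facts used only to exercise the check.
ISA = {"Sharkey": ["Dog"], "Lassie": ["FemaleDog"]}
GENLS = {
    "FemaleDog": ["Dog", "FemaleAnimal"],
    "Dog": ["Animal"],
    "FemaleAnimal": ["Animal"],
}

def collections_of(term):
    """Every collection the term is an instance of: direct isa plus genls ancestors."""
    result = set()
    stack = list(ISA.get(term, []))
    while stack:
        col = stack.pop()
        if col not in result:
            result.add(col)
            stack.extend(GENLS.get(col, []))
    return result

def check(sentence):
    """Return a list of constraint violations for a (pred, arg1, ...) tuple."""
    pred, *args = sentence
    problems = []
    if pred in ARITY and len(args) != ARITY[pred]:
        problems.append(f"{pred} expects {ARITY[pred]} args, got {len(args)}")
    for position, arg in enumerate(args, start=1):
        required = ARG_ISA.get((pred, position))
        if required and required not in collections_of(arg):
            problems.append(f"arg{position} {arg} is not a {required}")
    return problems

print(check(("mother", "Sharkey", "Lassie")))  # well-formed
print(check(("mother", "Lassie", "Sharkey")))  # arg2 violates arg2Isa
print(check(("mother", "Sharkey")))            # wrong arity
```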
14Predicates and Rules
- Can be built to form meaningful, well-formed
logical sentences
- You can add your own, using ASSERT
Mt: AgentGMt
Rule:
(implies
  (and
    (isa ?HELP HelpingAnAgent)
    (performedBy ?HELP ?HELPER)
    (beneficiary ?HELP ?HELPED))
  (positiveVestedInterest ?HELPER ?HELPED))
15Specialized Content
- Cyc has several specialized and useful areas of
KB content
- Times and Dates: temporallyIntersects,
startsAfterStartingOf, YearsDuration
- Spatial Properties and Relations
- constituent, ingredient, 60 in predicates,
- 60 Shape Attributes
- Event Types, with Roles and Actors
- MovementEvent, MedicalTreatmentEvent,
GivingSomething
16The Cyc Lexicon
- Cyc also knows a lot about English
- There are entries for Lexical items as well
- Treat-TheWord Use-TheWord
- Several predicates express relationships which
translate English expressions into CycL (and vice
versa)
(verbSemTrans Use-TheWord 0 TransitiveNPFrame
(and (isa ACTION UsingAnObject)
(performedBy ACTION SUBJECT)
(instrument-Generic ACTION OBJECT)))
17Important Lexical Predicates
- denotation -- Relates a LexicalWord and
SpeechPart to some denoted Thing (e.g. some
Individual or Collection).
- multiWordString -- Relates a list of strings
(e.g. ("hot")), a LexicalWord (e.g. Dog-TheWord),
and a SpeechPart to some denoted Thing (e.g.
HotDog); cf. MultiWord-PhrasePredicate.
- verbSemTrans -- Relates a LexicalWord, sense
number, and SubcategorizationFrame to a
NLTemplateExpression; cf. SemTransPredicate.
- nameString -- Relates a Thing to a string which
(conventionally) refers to it
- We'll do some examples
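As a rough picture of how denotation and multiWordString could be used together, the sketch below maps English tokens to a Cyc term. The tables are tiny stand-ins for lexicon assertions: the CountNoun speech part, the `WORD_FOR_STRING` table and the `denote` helper are assumptions made for illustration, not part of any Cyc API.

```python
# (LexicalWord, SpeechPart) -> denoted term, in the spirit of denotation.
DENOTATION = {("Dog-TheWord", "CountNoun"): "Dog"}

# (leading strings, LexicalWord, SpeechPart) -> denoted term,
# in the spirit of multiWordString.
MULTI_WORD_STRING = {(("hot",), "Dog-TheWord", "CountNoun"): "HotDog"}

# Hypothetical helper table: surface string -> LexicalWord.
WORD_FOR_STRING = {"dog": "Dog-TheWord"}

def denote(tokens, speech_part="CountNoun"):
    """Map a token list to a term, treating the last token as the head word."""
    if not tokens:
        return None
    *prefix, head = [t.lower() for t in tokens]
    word = WORD_FOR_STRING.get(head)
    if word is None:
        return None
    if prefix:
        # Multi-word phrase: the leading strings plus the head word.
        return MULTI_WORD_STRING.get((tuple(prefix), word, speech_part))
    return DENOTATION.get((word, speech_part))

print(denote(["dog"]))
print(denote(["hot", "dog"]))
```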
18The Cyc Browser
- To run the Cyc KB Browser
- Run an image on a ja- machine.
- Move to /scr/nlp/src/cyc/cyc1.0enterprise/
- Run ./run-cyc.sh; Cyc will start to run on
your desktop.
- You can use the SubL interactor directly at the
prompt
- Or you can load up a browser from the ja- machine
(you'll need to forward the desktop image to your
machine) and set the address to
- http://localhost:3602/cgi-bin/cyccgi/cg?cb-start
19Exploring Cyc
- http://researchcyc.cyc.com/
- Playing around with the Browser is the only way to
really learn what's in Cyc.
- Logging In
- The Search Box
- The Hierarchy Browser
- Documentation (usr/pass: rcyc/rcyc)
- Ask
- Assert
- Query
- Toolbar
- Don't use the parser
20Example Application: Cyc in RTE
- We're looking at using Cyc in the context of
Recognizing Textual Entailment
- Dependency parses are a good starting point for
Cyc
(PID 702, Hypothesis) In the late 1980s Budapest
became the center of the reform movement.
21RTE in a Nutshell
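[Figure: two dependency graphs being matched. One
has "bought" with subj "Chris (person)" and object
"car"; the other has "purchased" with subj
"Chris (person)" and object "BMW".]
- bought / purchased: Synonym Match, Cost 0.2
- Chris (person) / Chris (person): Exact Match,
Cost 0.0
- car / BMW: Hypernym Match, Cost 0.4
Vertex Cost = (0.0 + 0.2 + 0.4)/3 = 0.2
Relation Cost = 0 (Graphs Isomorphic)
Match Cost = 0.55 (0.2) + (0.45) 0.0 = 0.11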
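The arithmetic on this slide can be reproduced with a short sketch. The per-match costs and the 0.55 / 0.45 weights are the ones shown on the slide; the `match_cost` function and its parameter names are hypothetical, and a real RTE system would compute the relation cost from the graphs rather than take it as an input.

```python
# Per-vertex costs by match type, as given on the slide.
VERTEX_COSTS = {"exact": 0.0, "synonym": 0.2, "hypernym": 0.4}

def match_cost(vertex_match_types, relation_cost, vertex_weight=0.55):
    """Weighted sum of the average vertex cost and the relation cost."""
    vertex_cost = sum(VERTEX_COSTS[t] for t in vertex_match_types) / len(vertex_match_types)
    return vertex_weight * vertex_cost + (1 - vertex_weight) * relation_cost

# Chris/Chris is exact, bought/purchased is a synonym, car/BMW is a hypernym;
# the graphs are isomorphic, so the relation cost is 0.
cost = match_cost(["exact", "synonym", "hypernym"], relation_cost=0.0)
print(round(cost, 2))
```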
22Cyc and Java
- We clearly need a way to interact with the Cyc KB
programmatically
- Cyc APIs exist for Java and Python
- (check out /src/nlp/src/cyc/api/java/OpenCyc.jar)
- Documentation is sparse
- Cyc could be really valuable, if we can figure
out a way to get around what's missing
- I've got code (soon to be in JavaNLP) for generic
interactions with the CycKB, and for searching
Cyc space along genls relationships as a measure
of verb similarity
- It's a huge KB, so use your imagination
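To give a feel for searching along genls as a similarity measure, here is a minimal sketch over a toy hierarchy. It is not the JavaNLP code mentioned above: the collection names and links in `GENLS` are invented for illustration, and the distance (steps up to the nearest shared ancestor from both sides) is just one plausible choice.

```python
from collections import deque

# Toy genls hierarchy: collection -> its direct generalizations (all hypothetical).
GENLS = {
    "Buying": ["Transaction"],
    "Renting": ["Transaction"],
    "Transaction": ["Event"],
    "Walking": ["MovementEvent"],
    "MovementEvent": ["Event"],
    "Event": [],
}

def ancestor_depths(term):
    """Breadth-first search upward; maps each ancestor to its distance from term."""
    depths = {term: 0}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for parent in GENLS.get(current, []):
            if parent not in depths:
                depths[parent] = depths[current] + 1
                queue.append(parent)
    return depths

def genls_distance(a, b):
    """Steps from a and b to their nearest shared ancestor, or None if none exists."""
    depths_a, depths_b = ancestor_depths(a), ancestor_depths(b)
    shared = depths_a.keys() & depths_b.keys()
    if not shared:
        return None
    return min(depths_a[c] + depths_b[c] for c in shared)

print(genls_distance("Buying", "Renting"))  # meet at Transaction
print(genls_distance("Buying", "Walking"))  # meet only at Event
```

A smaller distance suggests more closely related verb senses; in this toy hierarchy Buying is closer to Renting than to Walking.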
23 Thank You