Title: Medical Informatics related disciplines
1 Medical Informatics related disciplines
2 (No Transcript)
3 Four basic elements of a database system
  User: end user, casual user, practitioners
  Data: text, graph, image, sound, video
  Software: general, application
  Hardware: CPU, I/O, network
Types of current database system
  Hierarchical database
  Network database
  Object-oriented database
  Relational database (dBASE III Plus, Clipper, FoxPro, Access)
Advantages of integrated database
  Data sharing
  Minimal data redundancy
  Data consistency
  Data standard improvement
  Data integrity improvement
  Data security
  Faster development of new applications
4 First normal form (1NF)
  If and only if
  1) all domains contain single values only
Second normal form (2NF)
  If and only if
  1) it is in 1NF, and
  2) every non-key attribute is fully dependent on the PK
Third normal form (3NF)
  If and only if
  1) it is in 2NF, and
  2) every non-key attribute is non-transitively dependent on the PK
Further normalization
  Boyce-Codd normal form (BCNF)
  Fourth normal form (4NF)
  Fifth normal form (5NF)
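The 2NF condition above can be illustrated with a small Python sketch. It assumes a hypothetical prescription table whose PK is (patient_id, drug_id); the table, its column names, and the helper functions are illustrative only. A dependency checked on sample rows can only be refuted by the data, never proven, since real functional dependencies are design decisions about what the data means.

```python
from itertools import combinations


def fd_holds(rows, lhs, rhs):
    """Return True if the sample rows do not contradict the dependency lhs -> rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # The first value seen for a key is remembered; any different value breaks the FD.
        if seen.setdefault(key, val) != val:
            return False
    return True


def partial_dependencies(rows, pk, non_key):
    """List (pk_subset, attribute) pairs where a non-key attribute depends on
    only part of the PK, which is what 2NF forbids."""
    found = []
    for size in range(1, len(pk)):
        for subset in combinations(pk, size):
            for attr in non_key:
                if fd_holds(rows, subset, (attr,)):
                    found.append((subset, attr))
    return found


# Hypothetical sample data (an assumption, not from the slides).
rows = [
    {"patient_id": 1, "drug_id": "A", "dose_mg": 50, "drug_name": "aspirin"},
    {"patient_id": 1, "drug_id": "B", "dose_mg": 20, "drug_name": "statin"},
    {"patient_id": 2, "drug_id": "A", "dose_mg": 75, "drug_name": "aspirin"},
]
pk = ("patient_id", "drug_id")

violations = partial_dependencies(rows, pk, ("dose_mg", "drug_name"))
print(violations)
```

In this sample, drug_name is determined by drug_id alone, so the table is not in 2NF; moving drug_name into its own table keyed by drug_id would remove the partial dependency.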
Normalization (best table design)
  Separate the table until every field depends on the primary key only
Primary key (PK)
  The group of attributes that uniquely (no null, ...) determines every other attribute in the relation
Candidate key
  Any group of attributes that could serve as the PK is a candidate key
Foreign key
  An attribute of the referencing table whose values are PK values of the home table
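As a rough sketch of how primary and foreign keys behave in practice, the following uses Python's standard sqlite3 module with two hypothetical tables: drug is the home table, and prescription references it. The table and column names are assumptions made for illustration. SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3


def build_db():
    """Create an in-memory database with a home table and a referencing table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # FK checks are off by default in SQLite
    conn.executescript(
        """
        CREATE TABLE drug (
            drug_id   TEXT NOT NULL PRIMARY KEY,
            drug_name TEXT NOT NULL
        );
        CREATE TABLE prescription (
            patient_id INTEGER NOT NULL,
            drug_id    TEXT NOT NULL REFERENCES drug(drug_id),  -- foreign key
            dose_mg    INTEGER NOT NULL,
            PRIMARY KEY (patient_id, drug_id)                   -- composite PK
        );
        """
    )
    conn.execute("INSERT INTO drug VALUES ('A', 'aspirin')")
    conn.execute("INSERT INTO prescription VALUES (1, 'A', 50)")
    return conn


def try_insert(conn, sql):
    """Run one INSERT and report whether the key constraints accepted it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


conn = build_db()
# Same (patient_id, drug_id) as an existing row: violates PK uniqueness.
print(try_insert(conn, "INSERT INTO prescription VALUES (1, 'A', 75)"))
# drug_id 'Z' is not a PK value in the home table: violates the FK.
print(try_insert(conn, "INSERT INTO prescription VALUES (2, 'Z', 10)"))
# New PK value that references an existing drug: accepted.
print(try_insert(conn, "INSERT INTO prescription VALUES (2, 'A', 10)"))
```

Keeping drug_name only in the drug table means each name is stored once and every prescription reaches it through the foreign key.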
10 Knowledge discovery in databases (KDD)
An explosive growth in capabilities to both generate and collect data
  scientific data (e.g. remote sensors, space satellites)
  business data (e.g. bar codes for commercial products, credit cards)
  human genome database (human genetic code)
  government transactions (e.g. tax returns)
  health care transactions (control costs, improve quality)
  advanced data storage technology (faster, higher capacity, cheaper)
  better database management systems and data warehousing technology
These allow us to transform the data deluge into mountains of stored data
Such volumes of data overwhelm traditional manual methods of data analysis such as spreadsheets and ad-hoc queries, which
  can create informative reports from data
  cannot analyze the contents of reports to focus on important knowledge
A significant need exists for a new generation of techniques and tools with the ability to intelligently and automatically assist humans in analyzing the mountains of data for nuggets of useful knowledge
These techniques and tools are the subject of the emerging field of KDD
11 Knowledge Discovery and Data Mining
Knowledge Discovery
  Data preparation
  Search for pattern (Data Mining)
  Knowledge evaluation
  Knowledge interpretation
Data Mining
  Components
    Model representation
    Model evaluation
    Parameter search, model search
  Methods
    Decision trees and rules
    Neural networks
    Regression (linear and non-linear)
    Classification and clustering
    Probabilistic graphical dependency models (Bayesian networks)
    Relational learning models (autoregressive models)
    etc.
Applications
  Business management
  Health care fraud detection and prevention
  Astronomy
  Molecular biology
  Global climate change modeling
  Medicine
  etc.
12 Definitions of KDD terms
Data
  A set of facts F (e.g. cases in a database).
  Example: F is the collection of 23 cases with three fields each, containing the values of debt, income, and loan.
Pattern
  An expression E in a language describing the facts in a subset FE of F. E is called a pattern.
  Example: If income < t, then the person has defaulted on the loan.
Knowledge
  A pattern is called knowledge if a measure of it (e.g. of its interestingness) exceeds some user-specified threshold.
  It is purely user-oriented, and determined by whatever functions and thresholds the user chooses.
  By appropriate settings of thresholds, one can emphasize accurate predictors or useful patterns over others.
Data mining
  A step in the KDD process consisting of particular data mining algorithms that, under some acceptable computational efficiency limitations, produce a particular enumeration of patterns over F.
KDD process
  Involves data preparation, search for patterns, knowledge evaluation, and refinement involving iteration after modification.
  Is the process of using data mining methods (algorithms) to extract (identify) what is deemed knowledge according to the specifications of measures and thresholds, using the database F.
  Is interactive and iterative, involving numerous steps with many decisions being made by the user.
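The definitions above can be made concrete with a minimal Python sketch. It is built on assumptions: the eight cases are invented (not the 23 cases of the slide), the loan field is simplified to a boolean named defaulted, and coverage and confidence are just one possible choice of user-selected measures. The pattern family is the one from the example, "if income < t, then the person has defaulted on the loan".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    debt: float
    income: float
    defaulted: bool  # simplified stand-in for the loan field (assumption)


# F: a hypothetical set of facts.
F = [
    Case(30, 10, True), Case(25, 15, True), Case(40, 20, True),
    Case(10, 25, False), Case(35, 30, True), Case(5, 40, False),
    Case(20, 50, False), Case(15, 60, False),
]


def evaluate(cases, t):
    """Score the pattern E_t: 'if income < t then defaulted'.

    Returns (coverage, confidence): the share of F that E_t describes (FE),
    and the share of FE for which the conclusion is true.
    """
    covered = [c for c in cases if c.income < t]  # FE, the subset described by E_t
    if not covered:
        return 0.0, 0.0
    hits = sum(c.defaulted for c in covered)
    return len(covered) / len(cases), hits / len(covered)


def mine(cases):
    """Data mining step: enumerate one pattern per distinct income value used as t."""
    thresholds = sorted({c.income for c in cases})
    return [(t, *evaluate(cases, t)) for t in thresholds]


def knowledge(cases, min_coverage, min_confidence):
    """Knowledge evaluation step: keep patterns that meet the user's thresholds."""
    return [
        (t, cov, conf)
        for t, cov, conf in mine(cases)
        if cov >= min_coverage and conf >= min_confidence
    ]


for t, cov, conf in knowledge(F, min_coverage=0.3, min_confidence=0.8):
    print(f"income < {t}: coverage={cov:.3f}, confidence={conf:.3f}")
```

Raising min_confidence favors accurate predictors, while raising min_coverage favors patterns that describe more of F, which mirrors the point that knowledge is determined by the thresholds the user chooses.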
13 Knowledge discovery in databases vs. data mining (KDD vs. DM)
KDD
  has been mostly used by artificial intelligence and machine learning researchers
  the overall process of discovering useful knowledge from data
DM
  has been commonly used by statisticians, data analysts, and the MIS community
  application of algorithms for extracting patterns from data without the additional steps of the KDD process (such as incorporating appropriate prior knowledge and proper interpretation of the results)
  blind application of DM methods can be a dangerous activity
Links between KDD (and DM) and related fields
  machine learning
  pattern recognition
  databases
  statistics
  artificial intelligence
  knowledge acquisition for expert systems
  data visualization
14 The National Health Insurance Information System
Nine major systems
  Underwriting of insurance
  Payments for medical fees
  Management of medical affairs
  Financial management
  Administrative support
  Decision support system
  Exchange of information
  Management of community-based insuring agencies
  Safety control
15 (No Transcript)
16 Medical Informatics as a Discipline (http://www.cpmc.columbia.edu/edu/textbook)
medical informatics
  study and use of computers and information in health care
  purpose of this lecture is to further define the field
definition by the Association of American Medical Colleges (AAMC), 1986
  "Medical informatics is a developing body of knowledge and a set of techniques concerning the organizational management of information in support of medical research, education, and patient care.... Medical informatics combines medical science with several technologies and disciplines in the information and computer sciences and provides methodologies by which these can contribute to better use of the medical knowledge base and ultimately to better medical care."
17 history of computers
  1800s - Charles Babbage's Analytical Engine
  1890 - Herman Hollerith's punch cards for census
  1940s - early programmable digital computers (ENIAC)
  1950s - commercially available (UNIVAC)
  1960s - faster, more memory
  1970s - minicomputers
  1980s - microcomputers, networks
  1990s - RISC, workstations, growth of networks
appearance of computers in medicine
  1960s
    practical: early departmental and monolithic
    research: early ECG and diagnosis
  1970s
    practical: monolithic administrative, departmental, imaging (CT), early bibliographic retrieval
    research: alerts, Mycin (early successes)
  1980s
    practical: results reporting, outpatient, growth of clinical systems and databases
    research: AI, IR
  1990s
    practical: integration, communication
    research: vocab, interfaces, coding, evaluation
18 factors in lack of use of computers in clinical care
  involves complex organisms (unlike physical processes)
    if over-simplify, not useful (vs bank transaction)
    therefore need sophisticated abstraction, detail
  technology for gathering complex info. just emerging
    eg low use of QMR or DXplain
    therefore providers have not entered info.
  reimbursement has not been linked to clinical info.
    therefore many admin. systems but few clinical
  health care administered by individuals, small groups
    less need for coordination
  inertia
  fear
  ignorance
  money
  security, integrity
  lack of standards
  language
  previous failures
  rapid turnover of technology
19 factors in recent increase of medical informatics
  increase in use of technology - more data generated
  mobility of population - need to communicate
  specialization - need to communicate
  managed care systems - need to communicate
  rise in health care costs - attempt to control care
  improved hardware - faster and more memory
  improved methods - acquisition, transfer, retrieval
  reduced computer costs
  increased awareness
20 related fields
  biomedical engineering - ECG, devices
    MI: higher level of abstraction
  electrical engineering - hardware
  computer science - algorithms, closer to mathematics
    MI: specific to health domain
  medical computer science
    subdivision of comp sci
  cognitive science - AI and psychology
    not concerned with studying human brain
  information theory - physics of communication
  information (library) science - manage paper/elec info
    MI is close to this, but MI not limited to info. storage and retrieval
  software industry - producing products
    MI stresses evaluation
    MI not dependent on selling every product
makeup of med-info groups
  MDs, RNs, dentists, other health care workers
  PhDs, esp computer science (also physics, ...)
  administrators, policy planners
  masters, PhD programs in medical informatics
  industry
21 current issues in clinical care
  cost
  accessibility of health care
  coordinating care and setting policy
  acquisition and retrieval of data (eg across inst.)
  acquisition and sharing of knowledge (eg specialist)
medical informatics research mirrors clinical issues
  data acquisition - GUI, NLP
  data storage - databases, modeling
  vocabularies - format, content
  organization of data - Larry Weed POMR 1969
  machine interfaces - standards like HL7, security
  data retrieval - query languages
  knowledge acquisition
  knowledge representation - Arden
  application of knowledge when needed - decision analysis, alerts, diagnosis
  education
  care plans and practice guidelines