Title: Developing Archaeological Informatics: A Proposed Agenda for the Next Five Years
1Developing Archaeological Informatics: A Proposed Agenda for the Next Five Years
- Lohse E S, Schou C, Strickland A, Sammons D, Strickland J, Schlader R
- Idaho State University, USA
2Our Information Age
- All information is incomplete.
- Information widens the range of choices; it does not narrow it.
- Information is always subject to multiple interpretations and constructions.
3Data Information Knowledge
- Data
- Raw facts and observations
- Information
- Data given meaning and context
- Organized
- Knowledge
- Information you have internalized
4Information Production
- 1,000,000 terabytes
- 90% digital
- Growing 50% yearly
- (Gray and Szalay 2003)
5Storing All that Information
- Schematized storage (metadata) can help organization and research
- Schematized XML data sets are a universal way to exchange data
- Data are objects, and so need standard representations for classes and methods
6How to Keep Up
- Bring the analysis to the data
- Embed analyses in data systems
- Build smart data systems
- Build image-based data systems
7Analysis of Databases
- Create uniform samples
- Filter data
- Assemble subsets
- Estimate completeness
- Censor bad data
- Count and build histograms
- Generate Monte Carlo subsets
- Perform likelihood calculations
- Test hypotheses
- These tasks are best done inside databases (bring Mohamed to the mountain)
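
A minimal sketch of several of the tasks above (filtering, censoring bad data, estimating completeness, building a histogram, drawing a random subset) carried out by the database engine rather than in client code. It uses Python's standard sqlite3 module as a stand-in for a larger system; the `artifacts` table, its columns, the 20 mm bin width, and all sample values are assumptions made for illustration only.

```python
import sqlite3

# Hypothetical specimen table; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE artifacts (id INTEGER PRIMARY KEY, material TEXT, length_mm REAL)"
)
rows = [
    ("obsidian", 31.0), ("obsidian", 44.5), ("chert", 52.0),
    ("chert", 18.2), ("basalt", 75.9),
    ("chert", None),     # missing measurement
    ("obsidian", -1.0),  # negative length, treated as a recording error
]
conn.executemany("INSERT INTO artifacts (material, length_mm) VALUES (?, ?)", rows)

# Filter and censor bad data: keep only records with a plausible measurement.
clean = "SELECT * FROM artifacts WHERE length_mm IS NOT NULL AND length_mm > 0"

# Estimate completeness: share of records that survive censoring.
total = conn.execute("SELECT COUNT(*) FROM artifacts").fetchone()[0]
usable = conn.execute(f"SELECT COUNT(*) FROM ({clean})").fetchone()[0]
completeness = usable / total

# Count and build a histogram: 20 mm bins computed by the database.
histogram = conn.execute(
    "SELECT CAST(length_mm / 20 AS INTEGER) * 20 AS bin_start, COUNT(*) "
    f"FROM ({clean}) GROUP BY bin_start ORDER BY bin_start"
).fetchall()

# Generate a Monte Carlo subset: a random sample of three clean records.
subset = conn.execute(
    f"SELECT id FROM ({clean}) ORDER BY RANDOM() LIMIT 3"
).fetchall()
```

Only the small results (a count, a handful of bins, three ids) leave the database; the full table never has to be moved to the analysis code.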
8Go for Smart Data
- Too much data to move around, so take analysis to the data
- Do all data manipulations inside the database (build custom procedures and functions in the database)
- Guaranteed automatic parallelism
- Easy to build key custom functionality (pixel processing, temporal and spatial indexing, unified databases and procedures)
- Easy to reorganize data (multiple views make optimal analyses)
- Scalable to petabyte data sets
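
As a hedged illustration of building custom functions in the database, the sketch below registers a user-defined function with SQLite through Python's standard sqlite3 module and then calls it from SQL. The `points` table, the `elongation` function, and the 2.0 threshold are hypothetical. SQLite is an embedded engine, so this shows only the pattern of calling custom logic from queries; it does not demonstrate parallelism or petabyte scale.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE points (id INTEGER PRIMARY KEY, length_mm REAL, width_mm REAL)"
)
conn.executemany(
    "INSERT INTO points (length_mm, width_mm) VALUES (?, ?)",
    [(40.0, 20.0), (30.0, 30.0), (66.0, 22.0)],  # illustrative values
)

def elongation(length_mm, width_mm):
    """Length-to-width ratio; returns None (SQL NULL) when it cannot be computed."""
    if length_mm is None or not width_mm:
        return None
    return length_mm / width_mm

# Register the function so that SQL statements can call it by name.
conn.create_function("elongation", 2, elongation)

# The derived measure is computed row by row during query evaluation.
ratios = conn.execute(
    "SELECT id, elongation(length_mm, width_mm) FROM points ORDER BY id"
).fetchall()

# The custom function can also drive filtering and counting.
elongated = conn.execute(
    "SELECT COUNT(*) FROM points WHERE elongation(length_mm, width_mm) >= 2.0"
).fetchone()[0]
```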
9Data Mining Images
We can discover new types of phenomena using automated pattern recognition and multiscale analyses
10Disseminating Data will also change
- Expectations and standards must change
- There will be exponential growth
- Projects must become more responsible
11Archaeological Informatics
- Technical and Social Aspects of Information Technology
12Archaeological IT
- Quantitative methods
- Statistics and classification
- Archaeometry
- Visualization (imaging, CAD, multimedia and virtual reality)
- Expert systems
- Artificial intelligence
- GIS
- All require
- Digital archives
- Databases
13Access to Data
- Primary issue today
- Data are unsorted and unavailable
- The volume of existing data is huge and daunting
- Technological fixes are available
- but implementation is a social problem
14Information Science
- Scientists want open access to data and information
- Use of electronic media to enhance scientific communication is a huge shift in the conduct of basic science
- Potential for cross-disciplinary and international collaborations is booming
- Needs include building adequate metadata, accessing and migrating data, and controlling access to information
15What's Out There?
- Social and political agendas
- Competing proprietary interests
- Competition to control dissemination of the digital archive
16Current Risks
- The open environment will not continue
- Not everyone will catch on to using e-media structures
- Not all current e-media initiatives are altruistic or problem-solving without self-interest
17Practical Problems
- Scientists and policy-makers do not have an accepted theory for shaping IT
- Producers and users work within context-free models
- Lots of prototypes and initiatives with high promise and withered funding
18Practical Problems
- RESULT
- wasted funding, and orphaned data left in
marginal, decaying, dead systems and formats
19Assumptions for the Future
- E-media is better than traditional media
- E-communication will be less expensive
- Access to e-media will be easier and wider
- Systematic use of e-media will dramatically speed
up scientific communication
20Social Implications
- Electronic Data
- Electronic access to primary data
- High speed of sharing
- Target audiences will be selected
- Professional status based on quality of data
design and data sharing
- Traditional Print
- Print access to secondary information
- Slow speed publishing
- Broad dissemination
- Professional status based on quantity and venue
of publication
21Market Forces
- Open Access (transparent)
- Closed Access (opaque)
- Which will dominate?
22Liberating Archaeological Data
- Many archaeologists, working under Federal and State mandates, remain outside any long-term concern with data handling
- Data liberation runs afoul of insistence on fossilized traditional research practice, fueled by resource management contracts
- The Internet impacts the organization of archaeological knowledge, with a shift from hierarchical structures to network flows (Hodder 1998)
23Need for Re-thinking
- Archaeological classification practices will need to emphasize optimal structures for organization of archaeological data in an electronic environment
- Interpretive structures must admit variable ways of grouping data
- Higher order groupings (typologies) will have to be supplemented by alternative analytical groupings (material classes, deposition classes)
- Data structures will have to be flexible and analytical
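
One way to picture "variable ways of grouping data" is a flat record structure that can be regrouped on demand, so that a typology is one view among several rather than the storage layout. The sketch below is only an illustration under that assumption; the field names, type labels, and specimen values are all hypothetical.

```python
from collections import defaultdict

# Hypothetical specimen records; every value is illustrative only.
specimens = [
    {"id": "S1", "typology": "Type A", "material": "obsidian", "deposition": "hearth fill"},
    {"id": "S2", "typology": "Type A", "material": "chert", "deposition": "surface scatter"},
    {"id": "S3", "typology": "Type B", "material": "obsidian", "deposition": "surface scatter"},
    {"id": "S4", "typology": "Type B", "material": "chert", "deposition": "hearth fill"},
]

def group_by(records, field):
    """Group specimen ids by the value of any one field."""
    groups = defaultdict(list)
    for record in records:
        groups[record[field]].append(record["id"])
    return dict(groups)

# The same records, seen through three different analytical groupings.
by_typology = group_by(specimens, "typology")
by_material = group_by(specimens, "material")
by_deposition = group_by(specimens, "deposition")
```

Because no grouping is built into the stored structure, adding a new analytical class means adding a field, not reorganizing the data set.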
24New Structures Must Recover Links
- Traditional databases (TDs) have disparate or unlinked compendiums (fields with specimen measurements but no link to grey literature reports)
- TDs typically are arranged to follow a rigid linear structure based on chronological groupings dictated by field recovery records and publishing
- This produces intractable data sets, where important data remain unavailable because reclamation costs are so high, specialist data are not integrated with the overall data structure, and there is little potential for future synthesis
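
The missing link between specimen measurements and grey literature reports can be sketched as a many-to-many relation. The example below uses Python's standard sqlite3 module; the three tables, their columns, and the sample rows are hypothetical and are not drawn from any actual archive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE reports (report_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE specimens (specimen_id INTEGER PRIMARY KEY, length_mm REAL);
-- Link table: one specimen may appear in many reports and vice versa.
CREATE TABLE specimen_reports (
    specimen_id INTEGER REFERENCES specimens(specimen_id),
    report_id INTEGER REFERENCES reports(report_id),
    PRIMARY KEY (specimen_id, report_id)
);
INSERT INTO reports VALUES
    (1, 'Hypothetical survey report'),
    (2, 'Hypothetical lithic analysis');
INSERT INTO specimens VALUES (10, 41.5), (11, 28.0), (12, 63.2);
INSERT INTO specimen_reports VALUES (10, 1), (10, 2), (11, 1);
""")

# Recover the link: which reports discuss each measured specimen?
linked = conn.execute(
    "SELECT s.specimen_id, r.title FROM specimens s "
    "JOIN specimen_reports sr ON sr.specimen_id = s.specimen_id "
    "JOIN reports r ON r.report_id = sr.report_id "
    "ORDER BY s.specimen_id, r.report_id"
).fetchall()

# Find orphaned data: specimens with measurements but no linked report.
orphans = conn.execute(
    "SELECT s.specimen_id FROM specimens s "
    "LEFT JOIN specimen_reports sr ON sr.specimen_id = s.specimen_id "
    "WHERE sr.report_id IS NULL"
).fetchall()
```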
25A Theory of Information
- Proviso: we cannot enter new data as old structures into new IT (HTML, interrelational databases, and GIS) and expect working databases
- Current data systems are outdated: they evolved through expediency and are not grounded in theory
- The theory-driven structure of the data must be revived
26e.g., Metadata
- Data about data, providing information essential to data use and reuse
- Can refer to agreed-upon sets of fields and associated lexicons
- Can consist of detailed descriptions of measurement systems and rules for their application
- Data users need metadata to make intelligent decisions in selecting, using, adding to, or translating databases
27Current Standards for Metadata
- MARC (Machine-Readable Cataloging), library cataloging
- Text Encoding Initiative (TEI), standard descriptions of machine-readable text
- Directory Interchange Format (DIF), metadata for satellite imagery
- U.S. National Spatial Data Infrastructure (NSDI), complex descriptions of spatial data
- And of course, the Dublin Core
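
To make "agreed-upon sets of fields" concrete, the sketch below checks a metadata record against the fifteen element names of the simple Dublin Core element set. The record itself (an excavation data set with an extra `excavator` field) is hypothetical, and the check is a bare illustration of a shared field list, not a full Dublin Core validator.

```python
# The fifteen elements of the simple Dublin Core element set.
DUBLIN_CORE_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# Hypothetical metadata record for a data set; all values are illustrative.
record = {
    "title": "Projectile point measurements (hypothetical data set)",
    "creator": "Example Project Team",
    "date": "2004",
    "type": "Dataset",
    "format": "text/csv",
    "excavator": "Field Crew B",  # not a Dublin Core element
}

def unknown_elements(metadata):
    """Return field names that are not part of the agreed element set."""
    return sorted(set(metadata) - DUBLIN_CORE_ELEMENTS)

def unused_elements(metadata):
    """Return element names the record leaves empty (all are optional)."""
    return sorted(DUBLIN_CORE_ELEMENTS - set(metadata))

extras = unknown_elements(record)
missing = unused_elements(record)
```

Fields outside the shared set, such as `excavator` here, are exactly the ones another system would not know how to interpret without further documentation.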
28Metadata and Databases in the future
- Will improve access to data
- Will facilitate sharing and interoperability
- Will characterize and index data
- Will operate under the principles described above
- Analysis embedded in data
- Smart data systems
- Image-based data systems
29Measures for Data Quality for the information of the future
- Adequate description and meaning
- Specification of intended use and range of purposes and constraints
- Requirements for access and use
- Description and rationale for structure and design
- Global relationships to other databases
- Update cycle information
30Data Models
- Data are a model of the real world
- The description is arbitrary and biased
- Data models incorporate different data views
- Key issues: verification, validation and certification of data quality
- Measures: objective correctness (accuracy and consistency) and appropriateness defined by intended purpose
- Required elements: all data must be augmented with metadata to record information needed to assess data quality, record results of assessments, and support process control
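
The requirement that data carry metadata recording the results of quality assessments might look like the following sketch. The `Dataset` and `QualityAssessment` classes, the "all measurements positive" rule, and the sample records are assumptions introduced for illustration; a real project would define its own checks against its intended purpose.

```python
from dataclasses import dataclass, field

@dataclass
class QualityAssessment:
    """Result of one quality check, stored alongside the data."""
    check: str
    passed: bool
    note: str = ""

@dataclass
class Dataset:
    name: str
    intended_use: str  # appropriateness is judged against this purpose
    records: list
    assessments: list = field(default_factory=list)

def check_positive_measurements(dataset):
    """Consistency check (hypothetical rule): all measurements must be positive."""
    bad = [
        r["id"] for r in dataset.records
        if not (r["length_mm"] > 0 and r["width_mm"] > 0)
    ]
    result = QualityAssessment(
        check="all measurements positive",
        passed=not bad,
        note=f"failed ids: {bad}" if bad else "",
    )
    dataset.assessments.append(result)  # the assessment becomes part of the metadata
    return result

# Illustrative data only.
points = Dataset(
    name="hypothetical point measurements",
    intended_use="morphometric comparison",
    records=[
        {"id": 1, "length_mm": 40.0, "width_mm": 20.0},
        {"id": 2, "length_mm": 15.0, "width_mm": 0.0},
    ],
)
outcome = check_positive_measurements(points)
```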
31Problems: Data Deterioration
- Limited media life
- Rapid obsolescence of software and hardware
- Use of graphics, hypertext and linked structures only accelerates decay rates
- Data files will become increasingly dependent on specific software for continued interpretation
- Record keeping paradigms are essential (compression is not an option; annotated metadata must remain transparent)
32Preparing for the Future
- Archaeological data and information are growing exponentially
- New paradigms of data access and manipulation will be created
- Effects on theory and method will be extreme
- Effects on the culture of the discipline will prompt profound dislocations
33Preparing for the Future
- Not just more data and faster access
- Qualitative differences in
- data gathering methods
- social relationships between/among data, users, creators, and managers
- the discipline's expectations for publication and research
- AND
34Preparing for the Future
- methodological and theoretical paradigm changes
driven by technological innovations and social
interactions with the technology