eMail and Records Management with IBM Classification Module - PowerPoint PPT Presentation

Slides: 37
Provided by: JoshuaP153
Learn more at: https://firmcouncil.org
Transcript and Presenter's Notes

Title: eMail and Records Management with IBM Classification Module


1
eMail and Records Management with IBM
Classification Module
  • Jon Dellaria, IBM Certified ECM Information
    Technology Specialist

2
What is Classification?
Definition: clas·si·fi·ca·tion (klas-uh-fi-key-shuhn), n.:
the act of assigning an element (a document, for
example) to a category.
3
IBM Leadership in Text Analysis and
Classification
  • IBM has a 50-year history in text analysis and
    discovery
  • As early as 1957, IBM published pioneering research
    on text classification (and related topics,
    such as text search and automatic creation of
    text abstracts)
  • IBM invests $50M annually in research and
    development for search and text analytics
  • 200 people actively engaged in R&D
  • IBM holds over 200 patents in information access,
    with more each year

4
Options for Implementing the Classification
Process
5
IBM Classification Module: Implementing the
classification process in ECM more
  • Intelligent applications of policies via
    automatic, advanced classification
  • Combines the best automatic methods:
    context-sensitive and rule-based
  • Flexible automation levels accelerate adoption
    and acceptance
  • Incorporates user feedback in real time to
    improve understanding
  • Integrated into the IBM ECM architecture or used as a
    free-standing service
  • 12 languages and 3 more on the way!

ICM
6
Advanced Classification is Key to Compliant
Information Management
7
Advanced Classification: The Facts and Their
Implications
  • Fact 1: Humans provide, at best, marginally better
    accuracy in executing classification in
    controlled tests
  • Implication 1: Compliance professionals hold the
    incorrect assumption that humans are the best
    option for piece-by-piece decision-making
  • Fact 2: Business users find forced manual
    classification burdensome, and at least 50% will
    not participate
  • Implication 2: Results of human-reliant filing are
    inconsistent and inaccurate, resulting in
    effective accuracy of 50%, at best
  • Fact 3: Every manual classification forced on your
    users will cost your organization 17 cents in
    productivity
  • Implication 3: Widespread adoption of archiving or
    records management in your organization will
    lead to large, measurable productivity loss
  • Fact 4: Unstructured content makes up 80% of the
    volume of information in the average enterprise,
    and that segment is growing 30% annually
  • Implication 4: Deploying an archiving or records
    management initiative is an increasingly
    important, large-scale and difficult problem
8
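Critical Dimensions of Classification
[Chart: Automated vs. Manual classification with
increasing volume. Dimensions: Accuracy, Cost (per
doc), Consistency. Values shown: 92, 50-80 and 46
for Accuracy; 0.17 and < 0.01 for Cost (per doc);
100 and < 50 for Consistency.]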
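
To make the per-document cost figures concrete, here is a minimal sketch of the arithmetic. It assumes the 17-cent manual cost and the 1-cent upper bound for automated classification quoted in the slides; the function names are hypothetical, and the 581,634 volume is the Phase I email count reported in the pilot slides, used here only as an example input.

```python
# Illustrative arithmetic only; costs are kept in integer cents to avoid
# floating-point rounding.
MANUAL_COST_CENTS = 17          # per document, as quoted in the slides
AUTOMATED_COST_CENTS_MAX = 1    # slides state "< 0.01", so this is an upper bound


def cost_cents(n_documents: int, cents_per_document: int) -> int:
    """Total cost in cents for classifying n_documents."""
    return n_documents * cents_per_document


def format_dollars(cents: int) -> str:
    """Render an integer number of cents as a dollar string."""
    return f"${cents // 100:,}.{cents % 100:02d}"


phase_one_emails = 581_634
manual = cost_cents(phase_one_emails, MANUAL_COST_CENTS)
automated_max = cost_cents(phase_one_emails, AUTOMATED_COST_CENTS_MAX)

print("Manual:", format_dollars(manual))
print("Automated (upper bound):", format_dollars(automated_max))
```

The automated figure is only a ceiling, since the slides give the cost as less than one cent per document.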
9
Participation Impacts Accuracy
  • National Archives and Records Administration
    Study
  • Electronic Records Management initiative focused
    on user driven records declaration
  • 6-month study
  • 60% drop-off in participation in months after
    training
  • End users frequently outright refuse to
    categorize content

Participation in Manual Filing by Month
  • Manual classification and an emphasis on user
    training are outdated, providing inconsistent and
    inaccurate results

Inconsistent participation from humans is the
critical factor in evaluating different
classification methods
10
Manual Classification
With paper
With rudimentary electronics
Today's advanced electronics
11
Rules-based Classification
To: Bob Smith <Bob.Smith@hotmail.com>
From: Bill Roker <broker@financialadv.com>
Subject: Market Movement

Bob,
Hope you're doing well. I've got a sure thing going
with the stock we spoke about on the phone. I think
it's time to pull the trigger for my client. The
client's name is John Doe. His social is
123-45-6789. He's totally on board and he's excited
to take advantage of this new offer.
Talk to you tomorrow,
Bill

Bill Roker
212-555-1234
Financial Advisors, Inc.

  • Simple Rules: Does the body contain the phrase
    "sure thing"? Did the CFO send the email?
  • Complex Policies: Does the body contain the
    phrase "sure thing" in the same sentence as
    "stock"? Does the sender belong to the broker
    email group, and did the sender send an email
    externally using the phrase "sure thing" in the
    body?
  • Metadata extraction: Does the body of the email
    have anything that matches the pattern
    XXX-YY-ZZZZ?
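
The three kinds of checks on this slide can be sketched in a few lines of Python. This is a minimal illustration, not how IBM Classification Module implements rules: the function names, the naive sentence splitter, and the regular expression for the XXX-YY-ZZZZ pattern are all assumptions made for the example.

```python
import re

# Assumed pattern for "XXX-YY-ZZZZ": three digits, two digits, four digits.
SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# Naive sentence splitter: break after ., ! or ? followed by whitespace.
SENTENCE_SPLIT = re.compile(r"(?<=[.!?])\s+")


def contains_phrase(body: str, phrase: str) -> bool:
    """Simple rule: does the body contain the phrase (case-insensitive)?"""
    return phrase.lower() in body.lower()


def phrase_and_word_in_same_sentence(body: str, phrase: str, word: str) -> bool:
    """Complex policy: do the phrase and the word share a sentence?"""
    word_pattern = re.compile(rf"\b{re.escape(word.lower())}\b")
    for sentence in SENTENCE_SPLIT.split(body):
        lowered = sentence.lower()
        if phrase.lower() in lowered and word_pattern.search(lowered):
            return True
    return False


def extract_ssn_like(body: str) -> list:
    """Metadata extraction: return every substring matching XXX-YY-ZZZZ."""
    return SSN_LIKE.findall(body)


body = (
    "Bob, Hope you're doing well. I've got a sure thing going with the stock "
    "we spoke about on the phone. I think it's time to pull the trigger for "
    "my client. The client's name is John Doe. His social is 123-45-6789."
)

print(contains_phrase(body, "sure thing"))
print(phrase_and_word_in_same_sentence(body, "sure thing", "stock"))
print(extract_ssn_like(body))
```

Sender-based rules (the CFO, the broker email group) would need directory or header data that the slide does not describe, so they are left out of the sketch.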
12
Rule-based Classification's Achilles Heel: Rule
Maintenance, Accuracy and Cost
[Chart: Accuracy over Time. Labels: Changes in
business; Effort to adjust rules to new environment]
13
Context Sensitive Classification
Category 1
Category 2
Statistic-Based Categorization
Category 3
Unclassified text
14
Context Sensitive Classification
Simple rules or keyword based analysis can be too
coarse to make fine distinctions between
long-form texts with very different intent
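
As a rough illustration of statistic-based categorization, the sketch below trains a small multinomial Naive Bayes model on labelled example texts and assigns new text to the most probable category. Naive Bayes is a stand-in chosen for brevity; the slides do not say which statistical method IBM Classification Module uses, and the class name, training sentences and categories here are invented for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text: str) -> list:
    """Lowercase the text and keep alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesCategorizer:
    """Tiny multinomial Naive Bayes with add-one smoothing (illustrative)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word frequencies
        self.doc_counts = Counter()              # category -> number of examples
        self.vocabulary = set()

    def train(self, text: str, category: str) -> None:
        tokens = tokenize(text)
        self.word_counts[category].update(tokens)
        self.doc_counts[category] += 1
        self.vocabulary.update(tokens)

    def scores(self, text: str) -> dict:
        """Return the log-probability score of each category for the text."""
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        result = {}
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            log_prob = math.log(n_docs / total_docs)
            for token in tokens:
                log_prob += math.log((counts[token] + 1) / (total_words + vocab_size))
            result[category] = log_prob
        return result

    def classify(self, text: str) -> str:
        scores = self.scores(text)
        return max(scores, key=scores.get)


kb = NaiveBayesCategorizer()
kb.train("quarterly budget report and invoice payment schedule", "Finance")
kb.train("invoice approved for payment from the budget", "Finance")
kb.train("contract review and litigation hold notice", "Legal")
kb.train("please sign the contract before the litigation deadline", "Legal")
kb.train("new campaign launch and brand advertising plan", "Marketing")
kb.train("advertising plan for the spring campaign", "Marketing")

print(kb.classify("please review the attached contract"))
print(kb.classify("invoice payment for the budget"))
```

Because every word contributes to the score, the model weighs the whole text rather than firing on a single keyword, which is the distinction this slide draws against simple rules.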
15
Choosing the Right Classification Method
  • Combined approaches provide the maximum accuracy
    from automation, at a slight productivity cost
  • Automated methods slash the costs
  • Manual methods have high costs associated to them
  • Manual methods suffer from lack of participation,
    hampering their overall viability

[Chart: classification methods plotted by Accuracy
(Low to High) against Cost Savings / Productivity
(Low to High). Methods shown: Multiple Methods,
Context Based Classification, Complex Policies,
Rules Based Classification, Simple Rules, Authoring
Templates, Manual Classification. Annotation:
Consistent Participation Enforcement.]
16
Enterprise Compliance Vision: Integrated Agile ECM
Platform for Compliant Information Management
IBM ECM
Content Collection
17
Reclassification Records Management
18
US Army Email and Records Manager Pilot
  • GOAL
  • Provide a means to address Army's requirement for
    the successful records management of email
  • Challenges faced
  • Lack of records management follow-through from
    end users
  • Need to capture records and transactional
    activities from email
  • Need to capture records without user intervention

19
US Army Email and Records Manager Pilot
  • Success Criteria for pilot
  • Correctly capture and retrieve email provided
  • Ensure information is secure
  • Determine whether email can be accurately
    auto-categorized by the IBM Categorization Module
    (ICM)
  • Goal of 90% or better accuracy
  • Show how ICM learns and improves accuracy over
    time
  • Place categorized record emails under correct
    Army records disposition

20
Army Email Pilot Concept of Operations (CONOPS)
21
Concept of Operations
Tasks                                                Phase I    Phase II   Phase III
Identification of Records Categories                 ✓
Delivery of .pst files                               ✓          ✓          ✓
Organization of .pst files to build knowledge base   ✓
Ingesting of Emails - Build Corpus                   ✓
Ingesting of Emails - Auto Cat Runs                  ✓          ✓          ✓
Auditing                                             ✓          ✓          ✓
                                                     complete   complete   complete
22
Pilot Phases
  • Pre-Phase Activity
  • Teach the system by building the knowledge base
    (Corpus)
  • Phase I
  • Process the first run of sample .pst files
  • Review and Audit the results
  • Phase II (30 days later)
  • Process the second run of sample .pst files
  • Review and Audit the results
  • Phase III (30 days later)
  • Process the third run of sample .pst files
  • Review and Audit the results

23
Knowledge Base (Corpus) Training
PST Inboxes
Organized Email
User 1 Email
Record Category Marketing
User 2 Email
Record Category Legal
Army Records Managers
Record Category Finance
. . .
. . .
Record Category R&D
User n Email
24
Outlook Configuration
25
Building the Knowledge Base for Email
Categorization
26
Reports
27
Training Knowledge Base - The Results
Adjusted Data
Raw Data
28
Pilot Project Pre-Phase Activities
  • Build Categorization Knowledge Base
  • Work with Army Records Managers to define the
    most appropriate records categories and identify
    example mails for them
  • Goal
  • Find examples of email records for each of the
    record categories
  • Find 15-20 examples for each category
  • Results
  • 54 records categories were identified as being
    associated with the assigned offices
  • 28 categories have 15 or more examples
  • 26 categories have 14 or fewer examples
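
The coverage check behind these results can be sketched as a simple count of examples per record category. This is an illustration under assumptions: the function name, the category names and the example counts below are hypothetical, and only the threshold of 15 comes from the slide's 15-20 target.

```python
from collections import Counter

MIN_EXAMPLES = 15  # lower bound of the 15-20 examples-per-category target


def knowledge_base_coverage(labelled_examples, categories, min_examples=MIN_EXAMPLES):
    """Split categories into those with enough examples and those still short.

    labelled_examples: iterable of (email_id, category) pairs.
    categories: every record category identified, including empty ones.
    """
    counts = Counter(category for _, category in labelled_examples)
    ready = sorted(c for c in categories if counts[c] >= min_examples)
    short = sorted(c for c in categories if counts[c] < min_examples)
    return ready, short


# Made-up data: 16 Finance examples, 3 Legal examples, no Marketing examples.
examples = [(f"fin-{i}", "Finance") for i in range(16)]
examples += [(f"leg-{i}", "Legal") for i in range(3)]

ready, short = knowledge_base_coverage(examples, ["Finance", "Legal", "Marketing"])
print("Enough examples:", ready)
print("Need more examples:", short)
```

Passing the full category list separately lets categories with no examples at all show up in the "short" group instead of being silently skipped.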

29
Army Email Pilot Phase I-III Auto-Categorization
Steps
IBM P8 eMail Manager
.PST Files
IBM Categorization Module
P8 InBox Folder
Review / Audit
1 Army Records Manager
30
Pilot Project Phase I-III Activities
  • First Pass of Categorization (process .pst files)
  • Take the Knowledgebase created by Army Records
    Managers and apply it to the bulk of email
  • Measure categorization results returned and begin
    Audit and Review process
  • Audit and Review process
  • Audit: Used to confirm the accuracy of
    categorization via a random sampling of
    categorized results. If necessary, the chosen
    category may be modified, which serves to retrain
    the knowledgebase for the future
  • Review: Items that do not meet the defined
    thresholds for categorization are available for
    further analysis and categorization by records
    personnel
  • The result of Audit and Review is improved
    accuracy of the knowledgebase and therefore
    improved categorization for future email ingest
  • Post Audit/Review reprocessing of email to
    measure categorization improvements
  • Measure results for the completion of each Phase
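
The Audit and Review split described above can be sketched as a routing step over categorizer output. This is a minimal sketch under assumptions: the `Categorized` record, the 0.0 to 1.0 confidence score, the 0.8 threshold and the 30% audit sampling rate are all hypothetical, since the slides do not give the pilot's actual thresholds or sample sizes.

```python
import random
from dataclasses import dataclass


@dataclass
class Categorized:
    message_id: str
    category: str
    confidence: float  # assumed score between 0.0 and 1.0


def route(results, threshold=0.8, audit_rate=0.3, seed=0):
    """Return (filed, audit_sample, review_queue).

    Items below the threshold go to Review for records personnel.
    Items at or above it are filed, and a random sample of the filed
    items is drawn for Audit.
    """
    review_queue = [r for r in results if r.confidence < threshold]
    filed = [r for r in results if r.confidence >= threshold]
    sample_size = round(len(filed) * audit_rate)
    audit_sample = random.Random(seed).sample(filed, sample_size)
    return filed, audit_sample, review_queue


confidences = [0.95, 0.91, 0.40, 0.88, 0.62, 0.99, 0.79, 0.85, 0.97, 0.90]
results = [
    Categorized(f"msg-{i}", "Finance", score)
    for i, score in enumerate(confidences)
]

filed, audit_sample, review_queue = route(results)
print(len(filed), "filed,", len(audit_sample), "sampled for audit,",
      len(review_queue), "sent to review")
```

Any category corrected during Audit or assigned during Review would then be added back as a training example, which is the retraining loop the slide describes.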

31
Pilot Project Activities
  • Focus on email from 16 different offices across
    Army
  • Demonstrate ability to categorize emails across
    Army enterprise
  • PST files from 398 pre-selected users
  • 581,634 emails in total in Phase I
  • 581,256 emails in total in Phase II
  • 735,333 emails in total in Phase III
  • 1,898,232 total emails through Phase III
  • PST files transferred to the pilot system via
    secure connection

32
Phase I Categorization Results
                        First Pass   Post Audit/Review
Total Categorized       84.5%        98.8%
Total Not Categorized   15.5%        1.2%

Phase II Categorization Results
                        First Pass   Post Audit/Review
Total Categorized       99.01%       99.9%
Total Not Categorized   0.9%         0.1%

Phase III Categorization Results
                        First Pass   Post Audit/Review
Total Categorized       98.4%        99.9%
Total Not Categorized   1.6%         0.1%
33
Army Records Manager Observations
  • As a records manager with a 25-year background in
    federal and civilian records management, I
    believe the automatic categorization of
    information is the next logical evolution in
    managing the records of an organization.
  • The classifier correctly identifies categories of
    records based on information from office file
    plans. Since office file plans are incorporated
    within an agency records manual, the initial
    input for the system is nominal. The office file
    plan becomes the document classifier.
  • Because the classifier retains information on
    document retrieval activity, it may be
    appropriate for use in many other information
    management program areas, including the Freedom
    of Information and Privacy Act.

34
Demo
35
Thank You
36
IBM Records Manager with Army File Plan