Data Mining - PowerPoint PPT Presentation

Provided by: whi6

1
Data Mining
The collection, organization, and storage of data.
The Good Old Days: When the size of the data was
small and access was very localized, data mining
was the picture of simplicity.
Today's Reality: Not so much...
2
Sample Data Mining Applications
  • Purchasing Patterns
      • Determining credit and loan eligibility
      • Targeted marketing (sales, catalogs, coupons)
      • Product placement within store, website, etc.
  • Medical Records
      • Identifying disease outbreaks
      • Calculating insurance rates
      • Marketing new pharmaceuticals
  • Airline Flights
      • Planning for peak dates and times
      • Pinpointing passengers posing security threats
      • Monitoring weather conditions

3
The Three Stages
Exploration! Clean the data by selecting
specific features to focus on. Otherwise, the
sheer volume of information may be too complex to
analyze, you knucklehead!
Validation! Next, apply various mathematical
models to sample data until you can choose one
that fits the data's behavior and is thus
considered predictive. Nyuk, nyuk, nyuk!
Deployment! Use the selected model on any new
data to predict future outcomes.
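The three stages can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the sample data is synthetic, the two candidate models (a constant average and a least-squares line) and all names such as `fit_line` and `best_model` are made up for the example, and real projects would use far richer models.

```python
import random

# Synthetic sample data (an assumption for illustration):
# the outcome y depends on x1 only; x2 is irrelevant noise.
random.seed(0)
rows = []
for _ in range(200):
    x1, x2 = random.uniform(0, 10), random.uniform(0, 10)
    rows.append({"x1": x1, "x2": x2, "y": 3.0 * x1 + 2.0 + random.gauss(0, 1)})
train, holdout = rows[:150], rows[150:]

# Exploration: select the specific feature to focus on.
feature = "x1"

def fit_mean(data):
    """Candidate model A: always predict the average outcome."""
    mean_y = sum(r["y"] for r in data) / len(data)
    return lambda x: mean_y

def fit_line(data):
    """Candidate model B: least-squares line y = a*x + b."""
    n = len(data)
    mx = sum(r[feature] for r in data) / n
    my = sum(r["y"] for r in data) / n
    sxy = sum((r[feature] - mx) * (r["y"] - my) for r in data)
    sxx = sum((r[feature] - mx) ** 2 for r in data)
    a = sxy / sxx
    b = my - a * mx
    return lambda x: a * x + b

def mse(model, data):
    """Mean squared prediction error of a model on a data set."""
    return sum((model(r[feature]) - r["y"]) ** 2 for r in data) / len(data)

# Validation: fit each candidate on the training sample, score it on
# held-out sample data, and choose the one with the lowest error.
candidates = {"mean": fit_mean(train), "line": fit_line(train)}
errors = {name: mse(model, holdout) for name, model in candidates.items()}
best_name = min(errors, key=errors.get)
best_model = candidates[best_name]

# Deployment: use the selected model on new data.
prediction = best_model(4.0)
```

Scoring on held-out rows rather than on the training rows is a design choice of this sketch; it is one common way to check that a model "fits the data's behavior" rather than memorizing it.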
4
Exploiting The Data
Databases are often mined for information for
which they were not originally intended.
The Centers for Disease Control wishes to mine
insurance company records on disease incidents,
seriousness, patient background, etc., in order
to identify disease outbreaks.
Corporate telephone directories are designed to
facilitate contacting individuals, but can be
used by competitors to determine, for instance,
department sizes and thus company direction.
Banks identify high-risk neighborhoods for home
loans, but such redlining frequently defines
minority neighborhoods and results in racial
discrimination.
5
Identifying The Anonymous
Latanya Sweeney's 2001 MIT study:
87% of the U.S. population can be uniquely
identified based on three pieces of information:
  • 5-digit zip code
  • gender
  • date of birth

Using a free Massachusetts voter list and a $20
insurance company database for Cambridge, Sweeney
found:
  • Only 6 people with the governor's DOB
  • Only 3 of those were men
  • Only 1 of those had the governor's zip code

So, for $20, she obtained the governor's complete
medical history!
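The linkage described above amounts to joining two data sets on the triple (zip code, gender, date of birth). A minimal sketch follows; every record in it is invented for illustration, and the field names and the `link` helper are assumptions, not anything taken from the study itself.

```python
# Made-up "public" voter list: names plus the three quasi-identifiers.
voters = [
    {"name": "Alice Example", "zip": "02138", "gender": "F", "dob": "1950-03-14"},
    {"name": "Bob Example", "zip": "02138", "gender": "M", "dob": "1945-07-31"},
    {"name": "Carl Example", "zip": "02139", "gender": "M", "dob": "1945-07-31"},
]

# Made-up "anonymized" medical records: names removed, quasi-identifiers kept.
medical = [
    {"zip": "02138", "gender": "M", "dob": "1945-07-31", "diagnosis": "condition A"},
    {"zip": "02139", "gender": "F", "dob": "1962-11-02", "diagnosis": "condition B"},
]

def quasi_id(record):
    """The three pieces of information used for matching."""
    return (record["zip"], record["gender"], record["dob"])

def link(voters, medical):
    """Attach a name to each medical record whose quasi-identifier
    matches exactly one voter."""
    names_by_key = {}
    for v in voters:
        names_by_key.setdefault(quasi_id(v), []).append(v["name"])
    matches = []
    for m in medical:
        names = names_by_key.get(quasi_id(m), [])
        if len(names) == 1:  # a unique match means re-identification
            matches.append((names[0], m["diagnosis"]))
    return matches

reidentified = link(voters, medical)
# reidentified == [("Bob Example", "condition A")]
```

Note how the date of birth alone leaves two candidates (Bob and Carl), and adding the zip code narrows it to one, mirroring the 6, 3, 1 narrowing on the slide.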
6
Privacy Preservation: Data Obfuscation
Hide protected data by modifying some of it.
Example: U.S. Census Bureau Public Use Microdata
  • Summarize data by census block (min. 300
    people)
  • Use ranges of values instead of particular
    values
  • Eliminate sparse values (i.e., top/bottom
    coding)
  • Randomly swap values among similar individuals
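Three of the techniques in the list above (value ranges, top/bottom coding, and swapping) are simple enough to sketch. The function names, bucket width, and cutoffs below are hypothetical choices for illustration and are not the Census Bureau's actual procedures; in particular, `swap_values` assumes the caller passes in a group of records already judged "similar".

```python
import random

def to_range(value, width):
    """Replace a particular value with the range (bucket) containing it."""
    low = (value // width) * width
    return f"{low}-{low + width - 1}"

def top_bottom_code(value, floor, ceiling):
    """Clamp sparse extreme values to a published floor or ceiling."""
    return max(floor, min(value, ceiling))

def swap_values(records, field, rng):
    """Randomly permute one field among a group of similar records.
    The set of values is unchanged, but no value is tied to its owner."""
    values = [r[field] for r in records]
    rng.shuffle(values)
    return [{**r, field: v} for r, v in zip(records, values)]

# Usage with made-up numbers:
age_range = to_range(37, 10)                         # "30-39"
income = top_bottom_code(250_000, 10_000, 150_000)   # 150000
group = [{"id": i, "age": a} for i, a in enumerate([30, 31, 32, 33])]
swapped = swap_values(group, "age", random.Random(0))
```

Swapping keeps group-level statistics such as the mean age intact, which is why it can hide individuals without distorting summaries of the group.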

7
Privacy Preservation: Summarization
Make only innocuous data summaries available.
Example: Statistical Queries
  • Users query protected data via statistical
    operators.
  • A group's total income doesn't reveal an
    individual's income.
  • Problem: Multiple Queries
  • Query 1: X = Total Salary For All Company
    Employees
  • Query 2: Y = Total Salary For All Employees
    Except Boss
  • So: Salary of Boss Equals X - Y
  • The Trick: Perturb the data and/or output by
    introducing noise without compromising the
    data's statistical integrity.
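The multiple-query problem and the noise-based fix can be shown with a toy example. The salary figures, the `total_salary` and `noisy_total` helpers, and the noise scale are all assumptions made for this sketch; choosing a scale that gives a real privacy guarantee is beyond what is shown here.

```python
import random

# Made-up protected data.
salaries = {"boss": 250_000, "ann": 60_000, "raj": 72_000, "lee": 58_000}

def total_salary(exclude=()):
    """Statistical query: total salary, optionally excluding some people."""
    return sum(pay for name, pay in salaries.items() if name not in exclude)

# The problem: two innocuous-looking totals reveal one individual's salary.
x = total_salary()                   # Query 1: all employees
y = total_salary(exclude=("boss",))  # Query 2: all employees except the boss
boss_salary = x - y                  # 250000, exactly

def laplace_noise(scale, rng):
    """Laplace(0, scale) noise, drawn as the difference of two
    independent exponential variables with mean `scale`."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def noisy_total(exclude=(), scale=20_000, rng=random):
    """The same query with random noise added to the output."""
    return total_salary(exclude) + laplace_noise(scale, rng)

# The trick: with perturbed outputs, X - Y no longer pins down the salary.
rng = random.Random(1)
estimate = noisy_total(rng=rng) - noisy_total(exclude=("boss",), rng=rng)
```

Because the noise has mean zero, totals over large groups stay roughly accurate, while the difference of two noisy answers carries enough uncertainty to blur a single person's value.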

8
Privacy Preservation: Data Separation
Allow only trusted parties to see the data.
Example: Patient Medical Records
For instance, a medical study might need data
from various providers (insurance company,
hospital, pharmacy, doctor) in order to correlate
complaints/procedures and unrelated drugs, such as
a correlation between, say, male
performance-enhancement drugs and rheumatoid
arthritis.
Each provider must agree not to release a
patient's data without the patient's consent.
9
Identity Theft
The Identity Theft and Assumption Deterrence Act
  • 1998 federal law
  • Federal crime when someone "transfers or uses,
    without lawful authority, a means of
    identification of another person with the intent
    to commit, or to aid or abet, any unlawful
    activity..."
  • Means of identification: name, SSN, credit card
    number, cellular telephone electronic serial
    number, etc.
  • Maximum penalty: 15 years' imprisonment, a fine,
    and forfeiture of any personal property used or
    intended to be used to commit the crime.

10
Phishing Expedition
Phishing is a high-tech scam that uses spam or
pop-up messages to deceive Web users into
disclosing credit card numbers, bank account
information, Social Security numbers, passwords,
or other sensitive information.
11
The Nigerian Scam
Claiming to be Nigerian officials,
businesspeople, or the surviving spouses of
former government honchos, con artists offer to
transfer millions of dollars to your bank account
for a small fee. A typical letter reads:
  • LAGOS, NIGERIA.
  • ATTENTION THE PRESIDENT/CEO
  • DEAR SIR,
  • CONFIDENTIAL BUSINESS PROPOSAL
  • HAVING CONSULTED WITH MY COLLEAGUES AND BASED ON
    THE INFORMATION GATHERED FROM THE NIGERIAN
    CHAMBERS OF COMMERCE AND INDUSTRY, I HAVE THE
    PRIVILEGE TO REQUEST FOR YOUR ASSISTANCE TO
    TRANSFER THE SUM OF $47,500,000.00 (FORTY SEVEN
    MILLION, FIVE HUNDRED THOUSAND UNITED STATES
    DOLLARS) INTO YOUR ACCOUNTS. THE ABOVE SUM
    RESULTED FROM AN OVER-INVOICED CONTRACT, EXECUTED
    COMMISSIONED AND PAID FOR ABOUT FIVE YEARS (5)
    AGO BY A FOREIGN CONTRACTOR. THIS ACTION WAS
    HOWEVER INTENTIONAL AND SINCE THEN THE FUND HAS
    BEEN IN A SUSPENSE ACCOUNT AT THE CENTRAL BANK OF
    NIGERIA APEX BANK.
  • WE ARE NOW READY TO TRANSFER THE FUND OVERSEAS
    AND THAT IS WHERE YOU COME IN. IT IS IMPORTANT TO
    INFORM YOU THAT AS CIVIL SERVANTS, WE ARE
    FORBIDDEN TO OPERATE A FOREIGN ACCOUNT THAT IS
    WHY WE REQUIRE YOUR ASSISTANCE. THE TOTAL SUM
    WILL BE SHARED AS FOLLOWS: 70% FOR US, 25% FOR
    YOU AND 5% FOR LOCAL AND INTERNATIONAL EXPENSES
    INCIDENT TO THE TRANSFER.
  • THE TRANSFER IS RISK FREE ON BOTH SIDES. I AM AN
    ACCOUNTANT WITH THE NIGERIAN NATIONAL PETROLEUM
    CORPORATION (NNPC). IF YOU FIND THIS PROPOSAL
    ACCEPTABLE, WE SHALL REQUIRE THE FOLLOWING
    DOCUMENTS
  • YOUR BANKER'S NAME, TELEPHONE, ACCOUNT AND FAX
    NUMBERS.
  • YOUR PRIVATE TELEPHONE AND FAX NUMBERS -- FOR
    CONFIDENTIALITY AND EASY COMMUNICATION.
  • YOUR LETTER-HEADED PAPER STAMPED AND SIGNED.
  • ALTERNATIVELY WE WILL FURNISH YOU WITH THE TEXT
    OF WHAT TO TYPE INTO YOUR LETTER-HEADED PAPER,
    ALONG WITH A BREAKDOWN EXPLAINING,
    COMPREHENSIVELY WHAT WE REQUIRE OF YOU. THE
    BUSINESS WILL TAKE US THIRTY (30) WORKING DAYS TO
    ACCOMPLISH.
  • PLEASE REPLY URGENTLY.
  • BEST REGARDS

12
Identity Theft Victims By State
Per 100,000 Population, 1/1/2004-12/31/2004