Title: Towards the Assessment of the EU Data Retention Directive
1 Towards the Assessment of the EU Data Retention Directive, Brussels, May 14th 2009
- Martin Willcox
- EMEA Director Platform Solutions Marketing
- Draft 2
2 Agenda
- Introduction to Teradata
- Approaches and challenges to satisfying the Data Retention Directive
- Practical experience and lessons learned from European Data Retention projects
- Summary conclusions
3 Who are we?
4 Teradata: who are we?
- Started life in a garage in California in 1979
- Now have global reach and annual revenues of $1.7B
- Core product offer is a high-performance parallel database platform that, by common consent (Gartner, Forrester, etc.), is the most sophisticated and scalable analytical platform in the industry
- Close to 1,000 of the leading global 3,000 organizations, including 9 of the top 10 global Telcos, trust us to manage their most important data and to provide the database processing required to optimize their analytical business processes.
Chances are better than even that when you last
bought groceries, last bought an airline ticket,
last took delivery of something that you bought
on-line, last withdrew cash from an ATM or last
made a call on your cell phone, this fact was
recorded somewhere in a Teradata database.
5 Approaches and challenges to satisfying the EU Data Retention Directive
6 Law Enforcement and Electronic Communications Data
- Law Enforcement Agencies continue to use electronic communications data to fight crime, serious fraud and terrorism
- The use of telephony data by the law enforcement agencies of member states for this purpose is well-established and pre-dates the introduction of the Directive
- The Directive has acted as a catalyst in codifying these arrangements in member states, e.g. the UK formerly had a voluntary code
- Electronic communications data now includes traditional telephony data and Internet Protocol data
- The general public in member states often does not appreciate the difference between Data Retention and Lawful Intercept, and this has contributed to privacy concerns.
7 Challenges with the retention of circuit-switched telephony data
- Most CSPs retain CDR data in a data warehouse to support business intelligence applications, for example:
  - Fraud prevention and revenue management
  - Customer value analysis and behavioral segmentation
  - Network management and capacity planning, etc.
- For mobile and fixed-line telephony services, Data Retention Directive requirements can be satisfied simply by retaining detailed data in these analytical databases for longer
- Chief challenge for many organizations is the cost associated with retaining this data for longer than business requirements would otherwise dictate
  - This issue drives some CSPs to move older data to on- and off-line archival systems, with potential negative consequences for access.
Note that some organizations have elected to
satisfy Data Retention requirements through the
construction of separate systems.
8 Challenges with the retention of IP data
- CSPs do not commonly retain detailed IP traffic data:
  - Privacy concerns
  - Business use-cases not yet well-developed.
- Data volumes are large and are increasing rapidly
  - Capturing and retaining this data represents a significant cost to IP CSPs
  - IP traffic volumes appear to us to be increasing by between 1.2x and 2x per annum.
- There appears to be some ambiguity about which organizations should bear responsibility for implementation in some cases, e.g.
  - CSP versus social-networking site
  - CSP versus VoIP service provider.
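
To put the quoted growth range in perspective, here is a minimal sketch of how annual factors of 1.2x and 2x compound. The five-year horizon and the baseline of 1.0 (today's volume) are illustrative assumptions, not figures from the presentation.

```python
def projected_volume(baseline: float, annual_factor: float, years: int) -> float:
    """Return the volume after compounding annual_factor for the given number of years."""
    return baseline * annual_factor ** years


# Assumed baseline of 1.0 (today's volume) and an assumed 5-year horizon.
for factor in (1.2, 2.0):
    growth = projected_volume(1.0, factor, 5)
    print(f"{factor}x per annum -> {growth:.2f}x after 5 years")
```

Under these assumptions the two ends of the range diverge quickly: about 2.5x at the low end versus 32x at the high end, which is why the uncertainty in the growth rate matters so much for capacity and cost planning.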
9 The competent agency perspective
- Call, operator A
- SMS, operator B
- ISPs C and D, to connect to social networking site
- Call, operator E
- MMS, operator F
10 Challenges: the competent agency perspective
- "the main challenges were found not in retrieval of data from the handset, but in the analysis of the records of calls made provided by the telephone companies"
  - Source: Home Affairs Committee Report on Terrorism Detention Powers, References to Telephony Evidence
- Why is this hard?
  - Data is distributed across multiple CSP repositories and must be re-combined by the agency
  - No common information or data exchange model to resolve semantic / technical inconsistencies
  - Variable data quality
  - Ambiguity / variability in CSP service levels.
These are data preparation issues; only once these are addressed can meaningful analysis begin.
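
The re-combination problem described above can be sketched as follows. This is an illustration only: the two operator formats, every field name, and the `CommonRecord` type are hypothetical, invented to show the kind of semantic and technical differences (time encoding, number formatting, naming) an agency would have to reconcile before analysis.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, List


@dataclass(frozen=True)
class CommonRecord:
    """Hypothetical common exchange format for records from any CSP."""
    timestamp: datetime  # always normalised to UTC
    caller: str          # international format with leading "+"
    callee: str
    kind: str            # "call", "sms", ...
    source: str          # which CSP supplied the record


def from_operator_a(row: dict) -> CommonRecord:
    # Assumed format A: epoch seconds, numbers already prefixed with "+".
    return CommonRecord(
        timestamp=datetime.fromtimestamp(row["start_epoch"], tz=timezone.utc),
        caller=row["a_number"],
        callee=row["b_number"],
        kind=row["type"].lower(),
        source="operator_a",
    )


def _plus_prefix(number: str) -> str:
    # Convert a leading "00" international prefix to "+".
    return "+" + number[2:] if number.startswith("00") else number


def from_operator_b(row: dict) -> CommonRecord:
    # Assumed format B: ISO 8601 local time with offset, numbers prefixed with "00".
    return CommonRecord(
        timestamp=datetime.fromisoformat(row["when"]).astimezone(timezone.utc),
        caller=_plus_prefix(row["from"]),
        callee=_plus_prefix(row["to"]),
        kind=row["service"].lower(),
        source="operator_b",
    )


def merge_timeline(records: Iterable[CommonRecord]) -> List[CommonRecord]:
    """Combine records from several CSPs into one chronological timeline."""
    return sorted(records, key=lambda r: r.timestamp)
```

Every CSP added to the picture needs its own mapping function of this kind, which is the practical cost of having no common data exchange model.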
11 Challenges: the competent agency perspective (continued)
- 210 billion
  - The estimated number of e-mails sent per day in 2008 (70% of this traffic is spam / malware-driven)
  - Source: Radicati Group
- The construction of national / supra-national databases capable of sustaining required loading rates would require very large investment
- Despite the disadvantages inherent in distributing electronic communications data across multiple repositories, economic and privacy considerations may necessitate the continuation of this approach in very many cases for the time being.
The role of standards and governance should assume enormous importance whilst cross-operator analysis is dependent on the ad-hoc re-assembly of multiple data-sets by LEAs.
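
As a rough feel for the loading rates involved, the sketch below converts the quoted daily e-mail figure into an average per-second rate. The figure is a worldwide estimate, so this is an upper bound on scale rather than the load any single national database would see; the even spread over 24 hours is a simplifying assumption (real traffic peaks would be higher).

```python
SECONDS_PER_DAY = 24 * 60 * 60


def per_second(events_per_day: float) -> float:
    """Average events per second, assuming an even spread over the day."""
    return events_per_day / SECONDS_PER_DAY


emails_per_day = 210e9  # worldwide estimate quoted on the slide (2008)
spam_share = 0.70       # share of spam / malware-driven traffic quoted on the slide

total_rate = per_second(emails_per_day)
non_spam_rate = per_second(emails_per_day * (1 - spam_share))

print(f"All traffic:      {total_rate:,.0f} records per second")
print(f"Excluding spam:   {non_spam_rate:,.0f} records per second")
```

Even with spam excluded, the average works out to roughly 0.7 million records per second worldwide, against roughly 2.4 million for all traffic.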
12 Practical experience from European Telco projects: lessons learned
13 Critical success factors
- Integrated logical data model that supports the capture and exploitation of traditional telephony data alongside IP data
- High-performance end-to-end technology infrastructure
  - Support loading of hundreds of millions of CDRs / day in multiple, narrow mini-batch windows
  - Support querying of 4 weeks of CDR data in seconds, querying of 12 months of CDR data in less than one hour
  - Careful balance of loading vs. access optimization.
- Scale out end-to-end technology infrastructure, to support year-on-year growth in volumes and users.
- Multi-temperature data management.
14 Multi-temperature data management
Data temperature: Hot, Warm, Cool
- Hot: heavily accessed, operational intelligence, shallow history
- Cool: regulatory compliance, trending analysis, deep history
The current state of the art is to use range partitioning and other sophisticated indexing strategies so that queries that do not need to scan the entire CDR table do not do so, and to use workload management so that queries that do need to access large volumes of cool historical data do not consume all system resources and block other work. New technology will permit the use of multiple storage devices with different performance characteristics in the same system, and the automated migration of data between these devices based on relative temperature.
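
The two ideas in this slide, partition elimination and temperature-based placement, can be sketched in a few lines. This is not how any particular database implements them: the monthly partition granularity, the function names, and the age thresholds for "hot" and "warm" are all assumptions chosen for illustration.

```python
from datetime import date
from typing import List


def month_partitions(start: date, end: date) -> List[str]:
    """Return the 'YYYY-MM' partitions a query over [start, end] has to read.

    With range partitioning by month (an assumed granularity), every other
    partition of the CDR table can be skipped.
    """
    parts = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        parts.append(f"{year:04d}-{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return parts


def temperature(partition: str, today: date,
                hot_months: int = 1, warm_months: int = 3) -> str:
    """Classify a partition by age in months; the thresholds are illustrative."""
    year, month = map(int, partition.split("-"))
    age = (today.year - year) * 12 + (today.month - month)
    if age <= hot_months:
        return "hot"
    if age <= warm_months:
        return "warm"
    return "cool"


today = date(2009, 5, 14)
for part in month_partitions(date(2008, 6, 1), today):
    print(part, temperature(part, today))
```

A four-week query touches only the one or two most recent partitions, while a twelve-month retention request reads mostly cool partitions, which is the workload that needs to be throttled so that it does not block other work.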
15 Summary conclusions
16 Summary conclusions
- Few technological barriers to the construction of the databases required to support the Data Retention Directive for circuit-switched telephony data
- Issues are chiefly organizational and political
  - One database or two? Can data be re-used?
  - How long must data be retained for?
  - How should CSPs be compensated for the costs that they incur?
  - What are the service levels that CSPs should maintain?
- Many technology / database vendors have introduced specific high volume product and service offers (including this one!)
- Lack of EU-wide standards to support integration and reconciliation of data from multiple CSPs is an issue that should be addressed, however.
17 Summary conclusions (continued)
- IP data presents a greater challenge
  - CSP vs. ASP issue
  - Scalability / cost challenge
  - Inherent anonymity of some IP-based communication
  - Business value of collecting this data not yet accepted by the majority of CSPs
  - Privacy concerns.
It should be noted that the business value of loading billing CDR data to data warehouses was initially disputed, but that almost all CSPs now capture this data for analysis. Indeed, many CSPs are now experimenting with the capture and storage of low-level network data to enable them, for example, to better understand the levels of customer service they provide and to support near real-time network configuration.