Highlights from EPC 2006

Slides: 51
Provided by: innocento

Transcript and Presenter's Notes

1
Highlights from EPC 2006
  • Vincenzo Innocente
  • On behalf of the Local Organizing Committee

2
EuroPython at CERN
  • EuroPython conference organized by SFT this year!
  • Three days
  • Parallel sessions in Bld 40
  • Keynotes and lightning talks in the main auditorium
  • Dinner in the Globe
  • 280 participants
  • 100 presentations (excluding lightning talks)
    • 5 by CERN

3
Schedule
  • 4 parallel sessions (in Bld 40)
    • All synchronized
    • 5-minute pause between talks
    • Easy for people to move from one session to another
  • Plenary sessions (lightning talks and keynotes) in the Main Amphi

4
Scientific Program
  • 7 tracks
    • Python in Science
    • Python Language and Libraries
    • Agile Development
    • Web Frameworks
    • Business and Applications
    • Teaching
    • Games and Entertainment

5
Community
  • Who
    • Wide age spectrum
    • Many in the post-doc age range
    • All 5 continents
    • Very few women (1-2, all managers?)
  • Where
    • Mostly companies developing software solutions
      • Revenue from selling custom products or services
      • Find business advantages in using open source software (and contribute to its development)
      • Develop components reusable beyond a specific project
    • Some research labs
      • Domain-specific applications
      • Reuse in the community (adapting to pre-existing habits)

6
Community
  • What
    • Core language development
    • Web frameworks, web applications
    • Software development tools (web based)
    • Scientific data processing, visualization
    • No sys-admin, net-admin, embedded software, office automation
  • Why
    • Hear news about the language, libraries, key products (Zope, ...)
    • Discuss, propose, complain
    • Present their products
      • In many cases just a spin-off component
    • Work (in Sprint sessions)

7
Messages
  • Python
    • A language for rapid prototyping, extreme programming, just-in-time deployment
    • THE integration framework
    • THE Business Domain Language
    • THE embedded scripting language
  • Python is faster than Assembler

8
Outline
  • What I will not cover
    • Latest and greatest features of Python
    • Python 3000
    • SciPy, PyTables, PyPy, Zope, Plone, Django, ...
    • Python in HEP
    • Google
  • I will focus on
    • Python: a framework for scientific applications
    • Building and sharing components
    • Python from fast prototyping to engineered code
    • Dispersed development

9
Scientific Frameworks
10
MGL Tools: independent and reusable components for structural bioinformatics
18
Pyphant
19
Pyphant application
20
Pyphant architecture
21
Worker Code
22
SciPy and ETS
23
Building and Sharing Components
25
Builds upon SciPy (data representation) and HDF5 (I/O layer)
27
The Company
28
The Customer
29
The new Components
  • For this customer they had two additional requirements to fulfill
    • Avoid bloating the CMS with binary files
    • Count the number of accesses
  • They developed two lightweight products
    • Plug into the deployed solution
    • Reuse the existing infrastructure
    • Reusable outside this project and company
    • Extendable to other architectures/frameworks
    • Contribution to open source software

30
Tramline
  • Tramline plugs in between Apache and Plone/Zope
  • On upload
    • Extract data to disk
    • Assign id
    • Store id in Zope
  • On download
    • Replace id with file content
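
The upload/download flow above can be sketched in a few lines of Python. This is only an illustration of the idea, not Tramline's actual code: the `FileStore` class, its method names, and the use of a UUID as the id are assumptions made for the example, and a plain dict stands in for the CMS.

```python
import tempfile
import uuid
from pathlib import Path


class FileStore:
    """Hypothetical stand-in for the layer between the web server and the CMS."""

    def __init__(self, directory):
        self.directory = Path(directory)

    def on_upload(self, data: bytes) -> str:
        # Extract the uploaded data to disk and hand back a short id;
        # only this id is passed on to the CMS.
        file_id = uuid.uuid4().hex
        (self.directory / file_id).write_bytes(data)
        return file_id

    def on_download(self, file_id: str) -> bytes:
        # Replace the id stored in the CMS with the file content from disk.
        return (self.directory / file_id).read_bytes()


cms = {}  # stands in for the CMS object database (assumption)
with tempfile.TemporaryDirectory() as tmp:
    store = FileStore(tmp)
    payload = b"\x00\x01" * 1024                      # a 2 KiB binary upload
    cms["report.pdf"] = store.on_upload(payload)      # the CMS keeps only the id
    restored = store.on_download(cms["report.pdf"])   # content comes back from disk
```

The point of the design is that the CMS database only ever stores the short id, so large binary files never enter it.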

31
LinkTally
  • Scan logs
  • Count requests
  • Store in the DB as metadata
  • Rank content in CMS
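
The log-scanning steps can be sketched as follows. This is a minimal illustration rather than LinkTally's implementation: it assumes access logs in Apache Common Log Format, and the sample log lines, the `count_requests` helper, and the dict used as a metadata store are all made up for the example.

```python
import re
from collections import Counter

# Matches the request part of a Common Log Format line,
# e.g. "GET /path HTTP/1.1" 200
REQUEST = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3})')

SAMPLE_LOG = """\
10.0.0.1 - - [03/Jul/2006:10:00:01 +0200] "GET /docs/a.pdf HTTP/1.1" 200 5120
10.0.0.2 - - [03/Jul/2006:10:00:05 +0200] "GET /docs/b.pdf HTTP/1.1" 200 2048
10.0.0.3 - - [03/Jul/2006:10:01:12 +0200] "GET /docs/a.pdf HTTP/1.1" 200 5120
10.0.0.4 - - [03/Jul/2006:10:02:40 +0200] "GET /docs/c.pdf HTTP/1.1" 404 312
"""


def count_requests(lines):
    """Count successful (status 200) requests per path."""
    hits = Counter()
    for line in lines:
        match = REQUEST.search(line)
        if match and match.group("status") == "200":
            hits[match.group("path")] += 1
    return hits


hits = count_requests(SAMPLE_LOG.splitlines())
# Store the counts as metadata (a dict stands in for the DB here)
metadata = {path: {"access_count": n} for path, n in hits.items()}
# Rank content by number of accesses, most requested first
ranking = [path for path, _ in hits.most_common()]
```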

32
LinkTally status and prospects
  • Now
    • Solution for one customer
    • Limited spin-off
  • Evolution
    • Contribution from community
    • Spin-in: use it in other projects!
33
From a prototype to a product
34
The Indico Technology
  • Main programming language: Python
  • Runs on Apache using the Python module mod_python
  • Persistence based on ZODB (Zope Object Database)
    • Transparency: no need for explicit reads/writes of the objects
    • Fits very well with Indico's complex object model
    • Proven performance and scalability
  • Timetable generation: libXML, libXSLT Python bindings
  • Portable technologies: runs on Windows, Linux
  • Export gateways
    • iCalendar, XML, PDF outputs
    • OAI (Open Archives Initiative) for ensuring integration with other services
      • Standard protocol for information exchange between digital libraries
      • Allows conference data to be exposed
      • Allows other systems to fetch conference data and build services over it
      • Simple mechanism: XML over HTTP
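
To make "XML over HTTP" concrete, here is a small sketch of what an OAI-PMH harvesting client does. It is not Indico code: the endpoint `http://example.org/oai` is hypothetical, and the response is a hand-written sample so that the example runs without network access. Only the `ListRecords` verb, the `oai_dc` metadata prefix, and the XML namespaces come from the OAI-PMH protocol itself.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

BASE_URL = "http://example.org/oai"  # hypothetical endpoint, not a real service

# An OAI-PMH request is an ordinary HTTP GET with a verb and a few arguments.
request_url = BASE_URL + "?" + urlencode(
    {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
)

# Canned response so the example runs offline; the record content is invented.
RESPONSE = """\
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:conf-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample Conference</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

root = ET.fromstring(RESPONSE)
# Collect (identifier, title) pairs from every record in the response
records = [
    (
        rec.findtext("oai:header/oai:identifier", namespaces=NS),
        rec.findtext(".//dc:title", namespaces=NS),
    )
    for rec in root.iterfind(".//oai:record", NS)
]
```

Any system that can issue an HTTP GET and parse XML can build services on top of the exposed data in this way.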

35
The Invenio Technology
  • Main programming language: Python
  • Runs on Apache using the Python module mod_python
  • Uses MySQL RDBMS
    • Takes advantage of a fully featured query language
    • Invenio home-made indexes
  • Internal representation with XML-MARC
  • Export gateways
    • Multiple output formats: HTML, XML, MARC, OAI, DC, etc.
  • Some modules
    • Still in PHP (slowly being moved to Python)
    • Some in Common Lisp (BibCheck)

36
Index Space Design (II)
  • Two important speed factors to consider
    • speed of set intersections (Web App Server)
    • speed of set marshalling (Web App <-> DB Server)
  • Data structures tested
    • sorted (lists, Patricia trees)
    • unsorted (hashed sets, binary vectors)
  • fast prototyping (Python)
    • throw-away coding, organic-growth software development model
  • typical search time gain: 4.0 sec -> 0.2 sec
  • typical indexing time loss: 7 hours -> 4 days
  • binary vectors found to be the best compromise (for all types of sets)
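
Why binary vectors make set intersection cheap can be seen in a short sketch. This is not the Invenio index code: it uses a plain Python integer as the bit vector, and the helper names and the two hit sets are invented for the example.

```python
def to_bitvector(record_ids):
    """Encode non-negative record ids as one int (bit i set means id i is present)."""
    vector = 0
    for rid in record_ids:
        vector |= 1 << rid
    return vector


def from_bitvector(vector):
    """Decode a bit vector back into a sorted list of record ids."""
    ids = []
    position = 0
    while vector:
        if vector & 1:
            ids.append(position)
        vector >>= 1
        position += 1
    return ids


# Hit sets for two hypothetical search terms
hits_python = {1, 4, 5, 9, 12}
hits_cern = {0, 4, 9, 10, 12}

# Intersecting the two sets is a single bitwise AND on the vectors
both = to_bitvector(hits_python) & to_bitvector(hits_cern)
result = from_bitvector(both)
```

The trade-off matches the one on the slide: searching reduces to a fast bitwise operation, while building and updating the vectors at indexing time costs more.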

37
Performance Benchmarks (2002)
  • Testing marshalling/intersection/union/unmarshalling
  • Bytecode interpreted language study (Python, Java)
    • Python faster than Java (mainly due to marshalling)
  • Machine code compiled language study (ML, Lisp)
    • OCaml, CMU CL 3 times faster than Python C libs
    • CMU CL scales best: intersecting 6M records in 0.01 sec, 30M records in 0.04 sec
  • Data structure study
    • OCaml, 3,000,000 records: bit vectors 0.43 sec, hashed sets 1.71 sec, lists 3.76 sec; Patricia trees do not scale well for dense sets
  • Python fast enough for production (1M records)
    • fast C modules: Numeric (byte/bit), Marshal, Psyco
38
The of Python
  • Clean, aesthetic language
  • Easy to learn, important for the many internship students and temporary members working on the project
  • Very good for rapid prototyping and organic-growth development
  • Plenty of ready-to-use modules
  • Bytecode-compiled only, speed okay for our needs

39
Use Python?
41
Dispersed Teams
42
Dispersed teams
48
At Last
49
What I Learned
  • Python is not just a language for scripting and glue code
    • Fully fledged, highly engineered frameworks can be written in Python
  • Frameworks and component architectures are established practices
    • Frameworks tend to be domain specific
    • All very similar to each other and share many design patterns
    • Many concepts common to modern HEP framework architectures
  • Business Domain Languages are essential
    • Python has the expressive power to implement them

50
What I learned
  • What can be reused?
    • Experience, patterns, ...
      • Provided one has a common culture
    • Low-level components
    • Plugin components
      • Provided that the interface is NOT business-domain specific
  • LHC is no longer at the frontier of distributed collaboration
  • There are individuals/labs/companies which value
    • Sharing information
    • Building reusable software components
    • Cooperating in developing the basic building blocks
    • Becoming a community around such a common ground

51
More?
  • Visit
    • http://vanrees.org/weblog/topics/europython
    • http://indico.cern.ch/conferenceDisplay.py?confId=44
    • http://www.europython.org/
    • http://www.google.com/search?q=europython