Case Study: Distributed Data Integration Framework. Roger Ruttimann, Lead Engineer, Enterprise Systems, GroundWork Opensource Inc.

Transcript and Presenter's Notes

1
Case Study: Distributed Data Integration Framework
Roger Ruttimann, Lead Engineer
Enterprise Systems, GroundWork Opensource Inc.
4th International Conference on Computer Science
and its Applications (ICCSA-2006)
2
Objective
  • Overview of integration of Open Source projects
    into the development process
  • Design, risk assessment, and implementation of a
    new product, leveraging OSS as much as possible
  • Discuss problems with this approach

3
Agenda Details
  • Case study of the development process for a Data
    Integration Framework for Monitoring
  • Project requirements overview
  • Design
  • Risk assessment
  • Implementation
  • Encountered problems / issues
  • Project life cycle and project maintenance
  • Lessons learned
  • Q & A

4
Project overview
Overview
  • The company offers support and installation
    assistance for an Open Source Monitoring system.
    One component is the Open Source Project called
    Nagios.
  • Runs on Linux/Unix only
  • Data storage in text files
  • UI consists of compiled C programs that parse the
    text files
  • Hard to scale and limited possibilities to
    improve the User Interface
  • The limitations to scale out and the User
    Interface are the two major issues hindering
    adoption in larger installations

5
Project Requirements
Overview
  • The goal was to come up with a framework that:
  • Leverages the core features of Nagios such as the
    monitor plugins, scheduler and the notification
    engine.
  • Extends the UI and the back end so that it can be
    deployed into larger data centers.
  • Has a generic data model so that other monitoring
    data can be integrated.
  • Uses an enterprise-type back end, including
    fail-over, load-balancing and high throughput.

6
Mission: The CTO said...
Overview
  • Enable integration with multiple open source and
    commercial monitoring tools
  • Provide a platform for a unified enterprise-class
    solution
  • Provide real open source flexibility and
    extensibility
  • Publish the Monitor Data Integration Framework as
    an Open Source project so that outside developers
    can contribute

7
Development Constraints
Design Phase
We are a startup company with limited development
resources and an aggressive schedule - we had to
use existing components. As a new company we
didn't have legacy libraries for re-use. The best
alternative was to leverage Open Source
components as much as possible.
8
Final Feature set
Design Phase
  • Cross-platform application written in Java
  • Data exchange with XML feeder framework
  • Pluggable data normalization components
  • Java, Perl and PHP APIs for accessing data
  • Property-driven data structure for great
    flexibility

9
(No Transcript)
10
API Layer
Design Phase
[Diagram: Java API, PHP API and Perl API; Lightweight Object Container; Data Access Objects (DAO); Object Relational Bridge; Data Model]
11
Data Feeder / data normalization layer
Design Phase
[Diagram: Feeders (Perl script, PHP script, JMS, C/C++, VB); XML Message; Listener / Message dispatcher; Adapter / Normalizer (x4); Data Access Objects (DAO); Object Relational Bridge; Lightweight Object Container; Data Model]
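The feeder path named on this slide (feeder, XML message, listener / message dispatcher, adapter / normalizer) might look roughly like the Java sketch below. The message layout, the attribute names, and the types `Normalizer` and `MessageDispatcher` are assumptions made for illustration; the slides do not show the framework's real XML schema or API.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

// Hypothetical plug-in point: one normalizer per monitoring tool.
interface Normalizer {
    Map<String, String> normalize(Element message);
}

// Hypothetical dispatcher: picks a normalizer by the message's "adapter" attribute.
class MessageDispatcher {
    private final Map<String, Normalizer> normalizers = new HashMap<String, Normalizer>();

    void register(String adapterName, Normalizer normalizer) {
        normalizers.put(adapterName, normalizer);
    }

    Map<String, String> dispatch(String xml) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Malformed feeder message", e);
        }
        String adapter = root.getAttribute("adapter");
        Normalizer normalizer = normalizers.get(adapter);
        if (normalizer == null) {
            throw new IllegalArgumentException("No normalizer for adapter: " + adapter);
        }
        return normalizer.normalize(root);
    }
}

class DispatcherDemo {
    static MessageDispatcher newDispatcher() {
        MessageDispatcher dispatcher = new MessageDispatcher();
        // Assumed Nagios-style mapping: tool-specific attributes become common property names.
        dispatcher.register("NAGIOS", (Element m) -> {
            Map<String, String> out = new HashMap<String, String>();
            out.put("Host", m.getAttribute("host"));
            out.put("MonitorStatus", m.getAttribute("state").toUpperCase(Locale.ROOT));
            return out;
        });
        return dispatcher;
    }

    public static void main(String[] args) {
        String xml = "<message adapter=\"NAGIOS\" host=\"web01\" state=\"down\"/>";
        System.out.println(newDispatcher().dispatch(xml));
    }
}
```

Because feeders only need to emit XML text, a Perl, PHP, or C/C++ feeder could send the same message without sharing any Java code with the listener.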
12
Common Data model
Design Phase
[Diagram: Application Programming Interfaces; Common Data Model with Event Data, Log Data and State Data, each carrying Properties; Collector Normalizers]
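A property-driven record, as the common data model suggests, could be sketched in Java as follows. The class `MonitorRecord`, its `Kind` values, and the property names are hypothetical; they only illustrate how event, log, and state data can share one shape while each carries its own open-ended properties.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical record type; not the framework's actual class.
class MonitorRecord {
    enum Kind { EVENT, LOG, STATE }

    private final Kind kind;
    private final String source;
    // Open-ended name/value pairs instead of fixed columns per monitoring tool.
    private final Map<String, String> properties = new LinkedHashMap<String, String>();

    MonitorRecord(Kind kind, String source) {
        this.kind = kind;
        this.source = source;
    }

    // Adds or overwrites a property and returns this record for chaining.
    MonitorRecord with(String name, String value) {
        properties.put(name, value);
        return this;
    }

    Kind kind() { return kind; }

    String source() { return source; }

    // Returns the property value, or the fallback when the property is absent.
    String property(String name, String fallback) {
        String value = properties.get(name);
        return value != null ? value : fallback;
    }

    Map<String, String> properties() {
        return Collections.unmodifiableMap(properties);
    }
}

class MonitorRecordDemo {
    public static void main(String[] args) {
        MonitorRecord state = new MonitorRecord(MonitorRecord.Kind.STATE, "web01")
                .with("MonitorStatus", "OK")
                .with("CheckType", "ACTIVE");
        System.out.println(state.kind() + " " + state.source() + " " + state.properties());
    }
}
```

The trade-off mentioned later in the deck follows from this shape: storing properties generically keeps the model flexible, but reading a record back from a relational database then needs joins across several tables.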
13
How to choose the components?
Evaluation / Risk assessment
  • Choose point solutions with minimal dependencies
  • Business layer should be database agnostic
  • Persistence layer should not depend on specific
    transaction managers or connection pools
  • Multiple projects with the same functionality
    should be available
  • Easier to replace a component if problems occur
  • License compatibility
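One common way to keep a business layer database agnostic is to have it depend only on a small data-access interface. The Java sketch below assumes hypothetical names (`HostStatusDao`, `InMemoryHostStatusDao`, `StatusService`) and uses an in-memory implementation as a stand-in for whatever persistence component is chosen.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical data-access contract; the business layer sees nothing else.
interface HostStatusDao {
    void save(String host, String status);
    String findStatus(String host);                    // null when the host is unknown
    List<String> findHostsWithStatus(String status);
}

// Stand-in implementation; a database-backed class would implement the same interface.
class InMemoryHostStatusDao implements HostStatusDao {
    private final Map<String, String> rows = new LinkedHashMap<String, String>();

    public void save(String host, String status) {
        rows.put(host, status);
    }

    public String findStatus(String host) {
        return rows.get(host);
    }

    public List<String> findHostsWithStatus(String status) {
        List<String> hosts = new ArrayList<String>();
        for (Map.Entry<String, String> row : rows.entrySet()) {
            if (row.getValue().equals(status)) {
                hosts.add(row.getKey());
            }
        }
        return hosts;
    }
}

// Business logic with no SQL, driver, or connection-pool types in sight.
class StatusService {
    private final HostStatusDao dao;

    StatusService(HostStatusDao dao) {
        this.dao = dao;
    }

    // Stores the status only when it differs from the last known one.
    boolean recordIfChanged(String host, String status) {
        if (status.equals(dao.findStatus(host))) {
            return false;
        }
        dao.save(host, status);
        return true;
    }

    int countDown() {
        return dao.findHostsWithStatus("DOWN").size();
    }
}

class StatusServiceDemo {
    public static void main(String[] args) {
        StatusService service = new StatusService(new InMemoryHostStatusDao());
        System.out.println(service.recordIfChanged("web01", "DOWN"));
        System.out.println(service.recordIfChanged("web01", "DOWN"));
        System.out.println(service.countDown());
    }
}
```

Swapping the persistence component then means writing one new implementation of the interface, which is what makes a problematic component easier to replace.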

14
Choosing the Business Logic to database bridge
Evaluation / Risk assessment
  • Requirements
  • Database agnostic. Not using stored procedures
  • Property-based data model requires a lot of
    cross-table joins to insert and retrieve data.
    Developers are used to manipulating objects
    rather than record sets.
  • For performance reasons a cache is required.
  • Data consistency requires Transaction support
  • Hibernate -- www.hibernate.org
  • High performance object/relational persistence
    and query service.
  • Most popular and stable O/R persistence tool
  • Online documentation and books available.
  • Active mailing lists and forums
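To illustrate why a cache helps when every read costs several joins, here is a minimal read-through cache in plain Java with least-recently-used eviction. It is not Hibernate's caching mechanism; the class `ReadThroughCache` and its loader function are assumptions for this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: keeps the most recently used entries and loads misses on demand.
class ReadThroughCache<K, V> {
    private final Function<K, V> loader;   // stands in for the expensive database read
    private final Map<K, V> entries;
    private int loads;

    ReadThroughCache(final int capacity, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder=true makes iteration order least-recently-used first.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    // Returns the cached value, loading and caching it on a miss.
    synchronized V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            value = loader.apply(key);
            loads++;
            entries.put(key, value);
        }
        return value;
    }

    // Number of times the loader was actually invoked.
    synchronized int loads() {
        return loads;
    }
}

class ReadThroughCacheDemo {
    public static void main(String[] args) {
        ReadThroughCache<String, String> cache =
                new ReadThroughCache<String, String>(2, host -> "status-of-" + host);
        cache.get("web01");
        cache.get("web01");   // served from the cache, no second load
        System.out.println("loader calls: " + cache.loads());
    }
}
```

A cache like this only stays correct if writes go through the same layer, which is one reason the slide lists transaction support alongside caching.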

15
Choosing the Database
Evaluation / Risk assessment
  • Requirements
  • Easy to install
  • Popular and accepted
  • Multi-platform support
  • MySQL -- dev.mysql.com
  • Most popular Open Source database
  • Easy to install and to maintain
  • Download, install, and be up and running in 15
    minutes
  • Online documentation and books available.
  • Active mailing lists and forums

16
Choosing the Lightweight object container
Evaluation / Risk assessment
  • Requirements
  • Framework to manage JavaBean object creation
    and maintenance
  • Minimal configuration at run time
  • Flexible to support aspect-oriented programming
    (AOP) and transaction management
  • Spring -- www.springframework.org/
  • Lightweight container with a far smaller footprint
    than any available J2EE container.
  • Configuration through XML format assemblies that
    can be injected at any time.
  • Seamless integration of Hibernate for transaction
    management.
  • Online documentation and books available.
  • Active mailing lists and forums
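The idea behind container-managed transactions through AOP can be approximated with the JDK's own dynamic proxies. The Java sketch below is not Spring code: `TransactionalProxy`, `EventStore`, and the log of "begin/commit/rollback" strings are hypothetical stand-ins that only show how transaction handling can wrap a bean without changing the bean itself.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical business interface.
interface EventStore {
    void store(String event);
}

// Hypothetical aspect: records begin/commit/rollback around every interface call.
class TransactionalProxy {
    @SuppressWarnings("unchecked")
    static <T> T wrap(final T target, Class<T> iface, final List<String> txLog) {
        InvocationHandler handler = new InvocationHandler() {
            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                txLog.add("begin " + method.getName());
                try {
                    Object result = method.invoke(target, args);
                    txLog.add("commit " + method.getName());
                    return result;
                } catch (InvocationTargetException e) {
                    txLog.add("rollback " + method.getName());
                    throw e.getCause();   // rethrow the bean's own exception
                }
            }
        };
        return (T) Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] { iface }, handler);
    }
}

class TransactionalProxyDemo {
    public static void main(String[] args) {
        List<String> txLog = new ArrayList<String>();
        EventStore plain = new EventStore() {
            public void store(String event) {
                System.out.println("storing " + event);
            }
        };
        EventStore transactional = TransactionalProxy.wrap(plain, EventStore.class, txLog);
        transactional.store("web01 DOWN");
        System.out.println(txLog);
    }
}
```

In a container the wrapping is declared in configuration rather than written by hand, so the bean stays free of transaction code.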

17
Risk assessment
Evaluation / Risk assessment
  • Choose popular and well documented projects
  • Monitor forums to observe common user issues
  • High traffic alone doesn't indicate a successful
    project
  • Consider only stable and documented features
  • Do extensive evaluation of core components but
    not of tool/utility components

Even following these rules won't protect you
from surprises. Unstable, fast-changing projects
can negatively affect your overall schedule.
18
(No Transcript)
19
Encountered issues / problems
Implementation
  • Java version. Clients were still running Java 1.3
    or Java 1.4.x. Java 5 offers improvements that we
    couldn't leverage.
  • By design all components are loosely coupled and
    therefore replaceable. This requires more upfront
    work to design the communication interfaces.
  • Documentation needs to be written!
  • Training of staff installing and supporting the
    framework.
  • Overhead of following Open Source projects to be
    informed about updates/problems that might affect
    the project

20
Project Lifecycle
Project Lifecycle
  • Feedback from the field needs to be integrated
  • Improvements / bugfixes from the various Open
    Source packages need to be evaluated and
    integrated.
  • Constant risk evaluation when integrating third
    party packages
  • Evaluate new feature requests
  • How do they fit into the framework?
  • Is there an Open Source package available?
  • What's the license?
  • Can we integrate it easily? How much custom code?

21
Release
Project Lifecycle
  • Data integration Framework was released to Open
    Source as GroundWork Foundation
  • http://gwfoundation.sf.net
  • Used as a part of GroundWork Monitor Professional
  • Customized by other users to store state and
    event information not directly related to
    infrastructure monitoring.
  • Development goes on: milestone releases are available
  • Since the project is public, developers have a
    responsibility to support users and guarantee
    stability

22
Did the chosen approach work out?
Project Lifecycle
  • Can we extend current design based on Open Source
    components?
  • Is the maintenance manageable since we integrated
    so many Open Source packages with their own
    lifecycle?
  • Is the built in flexibility really needed?

23
First design challenge: Adding new features
Project Lifecycle
  • Integration of new features
  • Remote API (WebService)
  • Higher throughput. Feed 500-1000 messages/sec
  • Integration of other Monitor systems such as JMX
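The slides do not say how the higher feed rate was reached. One generic technique for sustaining several hundred messages per second is to queue incoming messages and persist them in batches, so that one transaction covers many messages. The Java sketch below shows that technique only; `BatchingBuffer`, its capacity, and the batch size are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical buffer between the listener and the persistence layer.
class BatchingBuffer {
    private final BlockingQueue<String> queue;
    private final int maxBatch;

    BatchingBuffer(int capacity, int maxBatch) {
        this.queue = new LinkedBlockingQueue<String>(capacity);
        this.maxBatch = maxBatch;
    }

    // Called by the listener; returns false when the buffer is full (back-pressure signal).
    boolean offer(String message) {
        return queue.offer(message);
    }

    // Called by the writer; removes up to maxBatch messages to persist in one transaction.
    List<String> nextBatch() {
        List<String> batch = new ArrayList<String>(maxBatch);
        queue.drainTo(batch, maxBatch);
        return batch;
    }
}

class BatchingBufferDemo {
    public static void main(String[] args) {
        BatchingBuffer buffer = new BatchingBuffer(10000, 100);
        for (int i = 0; i < 250; i++) {
            buffer.offer("message-" + i);
        }
        List<String> batch;
        while (!(batch = buffer.nextBatch()).isEmpty()) {
            System.out.println("persisting " + batch.size() + " messages in one transaction");
        }
    }
}
```

The bounded queue matters as much as the batching: when feeders outrun the database, `offer` starts returning false instead of letting memory grow without limit.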

24
(No Transcript)
25
Second design challenge: Open Source package
upgrades
Project Lifecycle
  • Upgrade of core components
  • Hibernate update to version 3.1 (EJB 3.0
    compliant)
  • Spring Framework update to 2.0 (JMX
    support / enhanced AOP)
  • Upgrade to Java 5
  • Open Source packages have dependencies
  • Log4j, Commons, XML parsers, ...
  • Have unit tests in place to catch any differences
    and incompatibilities early
  • Even if the upgrade is a drop-in update you
    should leverage any new features and improvements
  • Once again check the forums and the mailing
    lists!

26
Conclusion
Lessons learned
  • Without the usage of available Open Source
    components we wouldn't have been able to meet the
    aggressive release schedule.
  • Open Source project evaluation and project
    monitoring need to be built into the development
    schedule
  • Mailing lists are a great help
  • Constant learning: projects change fast
  • Cleaner code since the code is public: developer
    pride!

27
More Info
  • Foundation Project
  • http://gwfoundation.sf.net
  • GroundWork Monitor Open Source
  • http://www.groundworkopensource.com/downloads
  • Contact
  • Roger Ruttimann
  • GroundWork Open Source, Inc.
  • 139 Townsend Street, Suite 100
  • San Francisco, CA 94107
  • rruttimann_at_groundworkopensource.com