Title: Running QAD MFG/PRO on RedHat Linux Enterprise Server in a Cluster Environment
1Running QAD MFG/PRO on RedHat Linux Enterprise
Server in a Cluster Environment
- Lauren Heck
- Manager, NA Braking Developers, TRW Automotive
- Dave Truchan
- System Specialist, Perot Systems Corporation
2Agenda
- Who are we
- Why Linux?
- Hardware
- Architecture
- Issues
- Where are we today
- Questions?
3Who are we?
- TRW Automotive
- 8th Largest Tier 1 Supplier
- $11 billion in Sales
- 50,000 employees
- Manufacture Brakes, Steering and Suspension Systems, Seat Belts, Air Bags, Engine Valves, Fasteners
- Formerly known as Kelsey-Hayes, Varity Kelsey-Hayes, Lucas Varity, TRW, or Northrop Grumman
4QAD Installations Worldwide
- North America
- Braking 14 Sites
- Engine Components 5 Sites
- Europe 7 Sites
- Asia 11 Sites
- Brazil 3 Sites
5Information Technology Systems Architecture
[Diagram: systems and interfaces, supported by 700 Progress Programs and 150 Shell Scripts.
Systems labeled: PDM system, BOM koblenz, T E, Freight Cojistics, MRO buyer, Royal bank checking, Shared Services, Ceridian, TACC, Frenos check Programs Banamex, SupplyWEB, Inspection System, JP Morgan.
Data flows labeled: Indirect POs, Blanket Releases / Vendor info, Ship/Sales, PO Info / Items, Inventory Requirement, Payroll Info, GL Info, Vouchers, Project Info, Receipts, Confirm Shipment to supplier, ASNs to customer, Delivery/Quality Info, Sales Order Customer Info, Customer Releases, Supplier Schedules, BPO / Releases, SMI, Kanban, Supplier ASNs, Remittance Advice, Purchase Orders, JIT Schedules, Supplier POs, Dealer directs, Core part, EFT.]
6Project Timeline
- 11/2002 EB2 Project Approved
- 12/2002 Decision to move to Linux
- 12/2002 Hardware/Software Ordered
- 01/2003 Linux Development Environment setup
- 02/2003 Test Databases created and converted
- 03/2003 Hardware/Software setup
- 04/2003 Test databases setup
- 07/2003 1st plants go live
- 08/2003 2nd set of plants
- 10/2003 3rd set of plants
- 11/2003 4th set of plants
- 12/2003 5th set of plants
- 02/2004 Last set of plants
7Major Upgrade Changes
- MFG/PRO version 8.5f -> MFG/PRO version EB2
- Progress Version 8.3c -> Progress Version 9.1D07
- HP UX 11 -> RedHat Linux Advanced Server 2.1
- HP Servers -> Dell Servers
- Trinary BPIs -> EDI E-Commerce Gateway
- PC Based Shipping System -> Web Based Shipping App
- Telnet -> QAD Desktop 2
8Risks
- No Linux Experience
- QAD could not point to a single customer who had attempted an implementation on Linux at this scale
- RedHat Linux in a Cluster Environment
- A lot of unknowns
9EB2 Hardware
- Corporate
- 4 Dell 6650 4 Processor Servers, 12G RAM
- 4 Dell 2600 2 Processor Web Servers, 1G RAM
- QLogic 2340 Fiber Cards
- 1 EMC CLARiiON CX600 Disk Array, 8G Cache, 2 24-port DS-24 McData switches, 1 Terabyte
- Fenton
- 2 Dell 2650 2 Processor Servers, 2G RAM
- 1 Dell PowerVault 220S Disk Array, 540G
- Perc 3 Raid Controller Cards
- Fowlerville
- 2 Dell 2650 2 Processor Servers, 2G RAM
- 1 Dell PowerVault 220S Disk Array, 540G
- Perc 3 Raid Controller Cards
- Development Server
- 1 Dell 6400 4 Processor Server, 4G RAM, 180G Disk
- Test Servers
- 2 Dell 6650 4 Processor Servers, 12G RAM
- 2 Dell 2650 2 Processor Servers, 1G RAM
10QAD Desktop 2 Layout
[Diagram: QAD Desktop 2 layout, showing the QAD Desktop Application Server and the MFG/PRO Database Server.
Components labeled: Web Server, Tomcat Server (Servlet Container), Embedded Telnet process, WebSpeed Remote Messenger, Connection Manager (host name and port), multiple Telnet Processes, Name Server, MFG/PRO Application Code (_progres), WebSpeed Broker, multiple WebSpeed Agents, Oracle or Progress Database.]
11Corporate Architecture (4) 2 Node Clusters
- (4) Dell 6650, 3 Plants each: Webspeed Broker, Name Server, QAD Software, Progress
- EMC CX600 Storage Array
- (4) Dell 2600 Desktop 2 Servers: Webspeed Remote Messenger, Apache, Tomcat
12Plant Setup
- Each plant has its own production and training QAD databases.
- Each plant has a production and training QAD Admin database.
- Each plant has 2 custom production and training databases.
- The QAD Help database is shared by all the plants on the server.
13Corporate Architecture (4) 2 Node Clusters
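[Diagram: plant databases distributed across the clustered servers, all attached to the EMC CX600 storage array.
Plants labeled: Canada Corp., Livonia Corp., Wixom, Chesapeake, Fayette, Mt. Vernon, Kingsway, Santa Rosa, Woodstock, Brighton, Jackson, Queretaro.
Web servers labeled: Qadweb1.livmi.trw.com, Qadweb2.livmi.trw.com, Qadweb3.livmi.trw.com, Qadweb4.livmi.trw.com.]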
14Failover Architecture
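[Diagram: failover example with several servers marked "Server Down" and their plants running on the surviving servers, all attached to the EMC CX600 storage array.
Plant groups labeled: Woodstock, Santa Rosa, Brighton, Jackson, Queretaro; Canada Corp, Livonia Corp, Chesapeake, Fayette, Mt. Vernon, Wixom; Kingsway.
Web servers labeled: Qadweb1.livmi.trw.com, Qadweb2.livmi.trw.com, Qadweb3.livmi.trw.com, Qadweb4.livmi.trw.com.]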
15Corporate Cluster/SAN Architecture
- (38) 36G 15,000RPM Drives
- (4) 8 Disks Raid 10 Groups
- (5) 36G Disks setup in Raid 5
- (15) 73G Disks Raid 5 for Development System
- (3) Hot Spares
- Each plant has its own set of LUNs (Database, AI, BI)
- Each plant has its own IP Address
- Can fail over 1 plant or all plants on the cluster
- Using EXT3 File System
- Running EMC Powerpath
- Using Legato Networker for backups
16Fenton and Fowlerville Cluster Architecture
[Diagram: Dell PowerVault 220S]
17Fenton/Fowlerville Cluster Architecture
- Each plant has its own set of Partitions (Database, AI, BI)
- Dell PowerVault 220S has 14 36G Drives
- 12 are 36G Raid 10
- 2 are 36G Raid 1 (Quorum Disk only for Cluster Stability)
- Perc 3 Controller Cards
- No Read Ahead
- Write through
- Mass Storage
- Each plant has its own IP Address
- EXT3 File System
- Using Backup Exec 9 for backups
18Issues
- Driver Issues
- Kernel Issues
- Vendor Support Issues
- Device Renaming Issues
- Kswapd Issues
- DR Issues
- Cluster Administration Issues
- Printing Issues
19Driver Issues
- Need megaraid_2002 driver for Dell 2600
- Get Driver from RedHat
- Load Driver during installation
- Network Card Lockups using Broadcom driver
- RedHat recommends using tg3 driver instead of Broadcom bcm5700 driver
- Dell 2600 and PowerVault 220S lock up. Not very stable.
- System would lock up during heavy I/O
- Clustering software unstable
- Servers rebooting
- Swapped with Dell 2650s
20Kernel Issues
- Kernel instability prior to e.18 (e.24 production kernel).
- Servers intermittently rebooting
- Difficult to reproduce
- Paid RedHat to send engineer on site
- Determines kernel is unstable
- Applies beta kernel
- Works with kernel programmers
- Modified beta kernel to fix instability issues
21Vendor Support Issues
- Vendor support issues
- Finger pointing
- No single point of contact
- No experience, no one has ever done this before
22Device Renaming Issues
- Device renaming issues
- Linux does not automatically write signatures on its disks to keep track of them
- Dell Engineer started a project called Devlabel to address this issue. Excellent white paper available at linux.dell.com
- EMC Powerpath addresses this issue as well
23Kswapd Issues
- Tuning kswapd
- kswapd utilizing 100% of CPU
- System response time very sluggish
- Changed kernel parameter vm.pagecache from 2 70 90 to 2 50 70 per RedHat recommendation
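A minimal sketch of how a change like this could be made persistent, assuming the setting lives in a sysctl.conf-style file. The helper name set_sysctl is hypothetical and not from the presentation; only the parameter vm.pagecache and the values 2 50 70 come from the slide.

```shell
#!/bin/sh
# set_sysctl FILE KEY VALUE (hypothetical helper):
# replace an existing "KEY = ..." line in FILE, or append one if missing.
set_sysctl() {
    file=$1; key=$2; value=$3
    tmp="$file.tmp.$$"
    # Copy every line except a previous setting of the key.
    grep -v "^[[:space:]]*$key[[:space:]]*=" "$file" > "$tmp" 2>/dev/null
    # Append the new setting so the key appears exactly once.
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    mv "$tmp" "$file"
}
```

Usage would look like `set_sysctl /etc/sysctl.conf vm.pagecache "2 50 70"` followed by `sysctl -p` as root to load the file. The parameter exists only on kernels that expose it, so check before applying.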
24Disaster Recovery Issues
- Mondo/Mindi Rescue issue
- Open Source Disaster Recovery Project for Linux
- www.mondorescue.org
- Software creates bootable ISO images
- Add insmod to /etc/mindi/deplist.txt
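The deplist change above can be scripted so that it is safe to repeat on every cluster node. This is a sketch only: the helper name add_dep is hypothetical, and whether the entry should be the bare name insmod (as on the slide) or a full path depends on the Mindi version in use.

```shell
#!/bin/sh
# add_dep FILE ENTRY (hypothetical helper):
# append ENTRY to FILE unless an identical line is already present.
add_dep() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}
```

For example, `add_dep /etc/mindi/deplist.txt insmod` adds the entry once and does nothing on later runs.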
25Cluster Administration Issues
- Cluster administration issues
- Keeping cluster in sync
- /etc
- cron jobs
- printers
- source code
- Progress and QAD software
- Dealing with .lk files after crash
- Dealing with (1180) shared memory error after database fails over
- Run prostrct repair to remove the shmid from the masterblock per Progress Solution ID P20475
- Have to stop tomcat first, before shutting down database.
- lynx -dump -accept_all_cookies -auth=user:password http://url:8080/manager/html/stop?path=/webapp
- Tuning cluquorumd polling interval via cludb command
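The "stop tomcat first" ordering can be sketched as a small wrapper. Everything configurable here is an assumption for illustration: the manager URL, the credentials, the webapp path, the database path, the function names, and the use of proshut with -by for the database shutdown. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical defaults; replace with the real host, credentials and paths.
MANAGER_URL=${MANAGER_URL:-http://localhost:8080/manager/html}
MANAGER_AUTH=${MANAGER_AUTH:-user:password}
DRY_RUN=${DRY_RUN:-1}   # 1 = print commands only, 0 = execute them

# Print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stop one Tomcat webapp through the manager application.
stop_webapp() {
    run lynx -dump -accept_all_cookies -auth="$MANAGER_AUTH" \
        "$MANAGER_URL/stop?path=$1"
}

# Stop the webapp first; shut the database down only if that succeeded.
stop_plant() {
    webapp=$1; db=$2
    stop_webapp "$webapp" && run proshut "$db" -by
}
```

Calling `stop_plant /webapp /db/prod` in dry-run mode prints the lynx command followed by the proshut command, which makes the ordering easy to review before setting DRY_RUN=0.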
26Printing Issues
- Printing issues
- LPRng issues. "Waiting for subserver to exit" error.
- Wrote shell script to address issue
- Looking into using CUPS.
27Things to Think About
- There is a cost to move to Linux (Progress) if you are not on bundled pricing.
- 3rd Party Applications.
- Check the Linux distribution's and the vendor's hardware compatibility pages to see if the hardware has been certified to run on that OS.
- For support reasons, consider buying everything from a single vendor.
- Have an adequate test system. In some cases it may be necessary to duplicate your production system so you can test adequately.
28Where are we Today
- Environment Stable
- Jobs run faster
- Implementing LDAP
- Looking at SUSE
- Purchased Gold Support from Dell so we have a single point of contact
- Investigating cluster filesystems such as PolyServe
- Get more users to use Desktop2
29Questions?