Technical Challenges in Using Perl for Commercial Software: Real Life War Stories

1
Technical Challenges in Using Perl for Commercial
Software: Real Life War Stories
  • Gurusamy Sarathy
  • ActiveState Corp.

2
Commercial Software Goals
  • Concurrency
  • Large number of things must happen at the same
    time
  • Robustness
  • Things cannot stop working: no excuses
  • Scalability
  • Organizations grow; software must not only cope,
    but also aid in this
  • Compatibility
  • Must support older code

3
Two Cases in Point
  • PerlMx
  • Mail server product for email content management
  • PerlEx
  • Web server product to accelerate CGI
  • Many similarities
  • Server class products
  • Many requests must be serviced as quickly as
    possible
  • Business-critical component

4
Concurrency
  • Motivation for threads
  • Faster performance, easier on the system
  • Reduced memory footprint
  • Easier data sharing
  • On the other hand
  • Most code isn't thread-safe and must be reengineered
  • Locking needs discipline and introduces
    complexity
  • Harder to get right, fix bugs

5
Global/static data
  • Instant recipe for race conditions
  • findglobals - Script to find globals by studying
    nm or dumpbin.exe output from Perl's object files
  • Must move globals into interpreter structure
  • Don't forget globals in XS code!
  • Addressed in 5.8 with MY_CXT macros
  • State is carried in the interpreter instead

6
Reentrant functions
  • Many APIs aren't reentrancy safe (implies thread
    unsafe)
  • localtime() => localtime_r()
  • asctime() => asctime_r()
  • findrfuncs - Script to find reentrant functions
    in the standard headers
  • Configure detects and uses them in 5.8
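
The switch from localtime() to localtime_r() can be sketched as below. This is a minimal illustration, not code from Perl itself; the helper name format_local is hypothetical. The point is that the struct tm lives in caller-owned storage rather than in a static buffer shared by all threads.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: format a timestamp as "YYYY-MM-DD HH:MM:SS" in
 * local time using only caller-owned buffers, so concurrent calls do not
 * share hidden static state. Returns 0 on success, -1 on failure. */
int format_local(time_t when, char *out, size_t outlen)
{
    struct tm tmbuf;   /* per-call storage instead of localtime()'s static buffer */

    assert(out != NULL);
    if (localtime_r(&when, &tmbuf) == NULL)
        return -1;
    if (strftime(out, outlen, "%Y-%m-%d %H:%M:%S", &tmbuf) == 0)
        return -1;     /* buffer too small */
    return 0;
}
```

A caller would pass its own buffer, for example `char buf[32]; format_local(time(NULL), buf, sizeof buf);`.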

7
Signals
  • Race conditions
  • e.g. system()
  • prev = signal(SIGINT, SIG_IGN)
  • run child process
  • signal(SIGINT, prev) /* ouch */
  • sigsetjmp() is not thread safe
  • Not really needed with safe signals in 5.8,
    setjmp() is sufficient

8
Environment
  • environ array gets munged when %ENV is modified
  • Race condition when it happens from different
    threads
  • Possible solutions
  • Virtualize environment
  • Lock access to environ (still problematic)
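
One way to read "virtualize environment" is to give each interpreter its own private table and never touch the process-wide environ. The following is a rough sketch under that assumption. The struct venv type, the VENV_MAX capacity, and the venv_* functions are all hypothetical names invented for illustration; they are not Perl APIs.

```c
#include <stdlib.h>
#include <string.h>

#define VENV_MAX 64   /* hypothetical fixed capacity, keeps the sketch short */

/* A private environment owned by one interpreter (or thread). */
struct venv {
    char  *vars[VENV_MAX + 1];  /* NULL-terminated "NAME=value" strings */
    size_t count;
};

void venv_init(struct venv *e)
{
    e->count = 0;
    e->vars[0] = NULL;
}

/* Look up NAME; returns a pointer to the value or NULL if unset. */
const char *venv_get(const struct venv *e, const char *name)
{
    size_t n = strlen(name);
    for (size_t i = 0; i < e->count; i++)
        if (strncmp(e->vars[i], name, n) == 0 && e->vars[i][n] == '=')
            return e->vars[i] + n + 1;
    return NULL;
}

/* Set NAME=value in this table only. Returns 0 on success, -1 on failure. */
int venv_set(struct venv *e, const char *name, const char *value)
{
    size_t n = strlen(name);
    char *entry = malloc(n + strlen(value) + 2);
    if (entry == NULL)
        return -1;
    memcpy(entry, name, n);
    entry[n] = '=';
    strcpy(entry + n + 1, value);

    for (size_t i = 0; i < e->count; i++) {
        if (strncmp(e->vars[i], name, n) == 0 && e->vars[i][n] == '=') {
            free(e->vars[i]);          /* replace an existing entry */
            e->vars[i] = entry;
            return 0;
        }
    }
    if (e->count == VENV_MAX) {        /* table full */
        free(entry);
        return -1;
    }
    e->vars[e->count++] = entry;
    e->vars[e->count] = NULL;          /* keep the array NULL-terminated */
    return 0;
}
```

Because vars is a NULL-terminated array of "NAME=value" strings, it could be handed to execve() as the envp argument when a child process is started, so the child sees this interpreter's view without the shared environ ever being modified.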

9
Fast sv_gets()
  • Snooping stdio buffers
  • Efficiency hack
  • Breaks when multiple threads mess with same FILE
  • e.g. stdin

10
fork()
  • Interacts with locks
  • Only one thread recreated in child
  • Deadlock if it is not the thread that holds lock
  • Need to use pthread_atfork()
  • Has issues with dynamic loading e.g. mod_perl
  • Can't undo pthread_atfork()
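
The usual pthread_atfork() pattern for the deadlock described above looks roughly like this. It is a sketch only: log_lock stands in for whatever lock the application really holds, and the function names are hypothetical. The prepare handler takes the lock before fork(), so the single thread that survives in the child is known to be the one holding it.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical application lock that other threads may hold at fork() time. */
static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;

/* Runs in the forking thread just before fork(): acquire the lock. */
static void before_fork(void)       { pthread_mutex_lock(&log_lock); }
/* Runs in the parent after fork(): release it again. */
static void after_fork_parent(void) { pthread_mutex_unlock(&log_lock); }
/* Runs in the child after fork(): the child's only thread owns the lock. */
static void after_fork_child(void)  { pthread_mutex_unlock(&log_lock); }

/* Register the handlers once at startup. Returns 0 on success. */
int install_fork_handlers(void)
{
    return pthread_atfork(before_fork, after_fork_parent, after_fork_child);
}

/* Fork a child that immediately uses the lock.
 * Returns the child's exit status, or -1 on error. */
int fork_and_use_lock(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        pthread_mutex_lock(&log_lock);    /* would hang without the handlers
                                             if another thread held the lock */
        pthread_mutex_unlock(&log_lock);
        _exit(0);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The registration cannot be removed afterwards, which is the dynamic-loading problem noted on the slide: if the code that registered the handlers is unloaded, the handler pointers are left dangling.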

11
dup2()
  • close(fd)
  • /* ouch */
  • dup2(otherfd, fd)
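
The "ouch" is the window between the two calls: once fd is closed, another thread's open() or socket() may be handed the same descriptor number before dup2() runs. A minimal sketch of the fix is to drop the close() and let dup2() replace the descriptor itself. The helper name redirect_fd is hypothetical.

```c
#include <unistd.h>

/* Make fd refer to the same open file as otherfd.
 * dup2() closes fd itself if it is open, so the descriptor number is never
 * released for another thread to grab (Linux documents the close-and-reuse
 * step as atomic). Returns 0 on success, -1 on failure. */
int redirect_fd(int otherfd, int fd)
{
    /* Deliberately no close(fd) here. */
    return dup2(otherfd, fd) == fd ? 0 : -1;
}
```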

12
fdopen()
  • fd = socket()
  • FILE *rf = fdopen(fd, "r")
  • FILE *wf = fdopen(fd, "w")
  • . . .
  • fclose(rf)
  • fclose(wf) /* ouch */

13
Robustness
  • Must cope with failures and carry on as much as
    possible
  • exit() and _exit()
  • No-no in a server environment

14
malloc()
  • Graceful handling of out of memory errors needed
  • Chicken-and-egg problems with interpreter
    creation and malloc() dependency on interpreter
  • Need a malloc() that is both thread-safe and
    scalable to large numbers of threads
  • vmem.h on Windows

15
Arbitrary limits
  • Max file descriptors per process (often set
    ridiculously low)
  • Can't go past a certain value
  • stdio will have issues
  • select() may have issues (use poll() instead)
  • Maximum heap space per process
  • Perl eats lots of memory
  • Maximum stack space
  • Regex engine may recurse massively for certain
    patterns and data; needs lots of stack
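
The select() issue is that an fd_set can only describe descriptors below FD_SETSIZE, whereas poll() takes the descriptor numbers directly. A minimal sketch of waiting on a single descriptor with poll() follows. The helper name wait_readable is hypothetical, and retrying on EINTR is left out for brevity.

```c
#include <poll.h>

/* Wait up to timeout_ms milliseconds for fd to become readable.
 * Works for any descriptor value, unlike select() with its FD_SETSIZE cap.
 * Returns 1 if readable (or hung up), 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int n = poll(&pfd, 1, timeout_ms);

    if (n < 0)
        return -1;                /* poll() itself failed */
    if (n == 0)
        return 0;                 /* timed out */
    return (pfd.revents & (POLLIN | POLLHUP)) ? 1 : -1;
}
```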

16
Signals again
  • Two approaches to safety
  • Do absolute minimal work in handler
  • Set a flag when signal arrives
  • Check for it in a safe spot
  • Perl 5.8 has this
  • Handle signals in a dedicated thread
  • Signal handling thread blocks waiting for signals
  • Better model in threaded environments
  • Perl doesn't play nice, calls signal APIs directly
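
The first approach (set a flag, check it later) can be sketched in plain C as follows. This illustrates the general technique and is not Perl's actual implementation; all function and variable names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Written by the handler, read at safe points in the main loop. */
static volatile sig_atomic_t pending_signal = 0;

/* The handler does the absolute minimum: record which signal arrived. */
static void note_signal(int signo)
{
    pending_signal = signo;
}

/* Install the deferred handler for one signal. Returns 0 on success. */
int install_deferred_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = note_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Call from a safe spot (for example between operations in a run loop).
 * Returns the pending signal number, or 0 if none. A signal arriving
 * between the read and the reset is lost in this sketch; a fuller version
 * would keep a counter per signal. */
int take_pending_signal(void)
{
    int signo = pending_signal;

    pending_signal = 0;
    return signo;
}
```

The real work of reacting to the signal happens in whatever code calls take_pending_signal(), outside signal context, where it is safe to allocate memory or take locks.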

17
Memory leaks
  • Leaks from compile-time errors
  • Flawed design of OP allocation management
  • Closures have typically been a problem
  • Reference counting cycles

[Figure: reference cycle, CV -> pad -> CV]

18
Portability issues
  • Is it a thread or process?
  • getpid() on Linux
  • Deleting files in use
  • Access denied on Windows
  • ETXTBSY on HP-UX
  • Debugging crashes
  • strace doesn't do threads on Linux
  • Solaris has nicer support
  • pfiles, pstack et al.
  • Coredumps on Linux unreliable

19
Scalability
  • Minimize locking
  • Only locking is for reference counting the OP
    tree, which can be shared when interpreter is
    created via perl_clone()
  • User code can kill scalability
  • flock()
  • print() can have unintended consequences
  • Typical problem area is log files

20
Limits
  • Every system has limits (just a matter of scaling
    high enough before you hit it)
  • Need better control over what happens when limits
    are reached
  • Heap
  • Stack
  • TCP buffers

21
Interpreter Startup
  • Can be very slow
  • Pileup effects with on-demand creation
  • Slow application startup when creating everything
    at once

22
Repetitive work
  • Multiple interpreter creation
  • Parsing same code at startup to create each
    interpreter is slow
  • perl_clone() should help
  • Has issues with regex state being in OP tree in
    Perl 5.6
  • 5.8 should fix it

23
Source Compatibility
  • THX macros add an implicit context pointer where
    needed
  • Made it possible to retain old function
    signatures
  • Context pointer consistently available
  • Autogenerated compatibility defines
  • iperlsys.h abstractions help virtualize system
    access
  • Overrideable function table
  • All calls to system are indirected through table
  • Different host systems may provide different
    systemic functionality

24
Binary compatibility
  • Interpreter structure
  • Changes broke binary compatibility
  • PL_foo access through Perl_Ifoo_ptr()
  • ret_type *Perl_Ifoo_ptr(pTHX)
  • return &PL_foo
  • #define PL_foo (*Perl_Ifoo_ptr(aTHX))
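
The accessor idea can be shown in plain C without any Perl headers. In this sketch struct interp, Interp_Ifoo_ptr, and bump_foo are hypothetical stand-ins for the interpreter structure, the generated accessor, and extension code. Compiled extensions call a function to reach a field instead of baking in its offset, so the structure layout can change without breaking them.

```c
/* Hypothetical stand-in for the interpreter structure. Fields may be
 * added or reordered between releases. */
struct interp {
    int  Ifoo;
    long Ibar;
};

/* Accessor compiled into the core library. Only this function knows
 * where Ifoo lives inside the structure. */
int *Interp_Ifoo_ptr(struct interp *my_interp)
{
    return &my_interp->Ifoo;
}

/* Extension code keeps writing PL_foo as if it were a plain variable.
 * The macro expects a variable named my_interp to be in scope, in the
 * same spirit as an implicit context pointer. */
#define PL_foo (*Interp_Ifoo_ptr(my_interp))

/* Example extension function: uses PL_foo as both lvalue and rvalue. */
int bump_foo(struct interp *my_interp)
{
    PL_foo += 1;
    return PL_foo;
}
```

The cost is one function call per access, which is the price paid for keeping already-compiled extensions working across structure changes.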

25
Platform issues
  • SMP bugs (Linux)
  • pthread bugs (Linux, FreeBSD)
  • libc bugs (Linux, Windows)
  • Silly stdio limits (Solaris, Windows)
  • How we cope
  • fork() rather than threads, where possible

26
Things to take home
  • Design for concurrency and scalability!
  • Cannot retrofit or reengineer for this easily
  • Eat your own dogfood
  • Fundamental QA principle
  • Stress test business-critical software
  • Use SMP hardware for testing
  • Build a beta cycle into release process

27
Questions?