1
Perl 6 Internals
  • Dan Sugalski
  • TPC 5.0
  • Here there be dragons

2
The big goals of perl 6's internals
  • Speed
  • Extendibility
  • Cleanliness
  • Compatibility
  • Modularity
  • Thread Safety
  • Flexibility

3
Some global decisions
  • The core will be in C. (Like it or not, it's
    appropriate for code at this level)
  • The core must be modular, so pieces can be
    swapped out without rebuilding
  • It must be fast
  • Long-term binary compatibility is a must
  • Your average perl coder or extension writer
    shouldn't need any info about the guts
  • Things should generally be thought out,
    documented, and engineered

4
The quick overview
  • Parser
  • Compiler
  • Optimizer
  • Runtime engine

5
[Diagram; labels: Fully-laden Interpreter, Parser, Syntax Tree, Compiler, Unoptimized Bytecode, Optimizer, Optimized Bytecode, Interpreter, Precompiled Bytecode]

6
The parser
  • Where the whole thing starts
  • Generally takes source of some sort and turns it
    into a syntax tree

7
The Bytecode Compiler
  • Turns a syntax tree into bytecode
  • Performs some simple optimization

8
The optimizer
  • Takes the plain bytecode from the compiler and
    abuses it heavily
  • An optional step, generally skipped for
    compile-and-go execution
  • Should be able to work on small parts of a
    program for JIT optimization

9
The Interpreter
  • Takes compiled (and possibly optimized) bytecode
    and does something with it
  • Generally that something is executing it, but it
    might also be:
      • Save to disk
      • Translate to another format (.NET, Java bytecode)
      • Compile to machine code

10
The Parser
  • Double, double, toil and trouble
  • Fire burn, and cauldron bubble

11
Parser goals
  • Extendible in perl
  • More powerful than what we have now
  • Retargetable
  • Self-contained and removable

12
Parsing perl isn't easy
  • May well be one of the toughest languages to
    properly parse
  • If we get perl right other languages are easy. Or
    at least easier
  • We have the full power of perl to draw on to do
    the parsing (Including the regex engine and
    Damian's Bizarre Idea du Jour)

13
The Compiler
  • Mmmmm, tasty!

14
From syntax tree to bytecode
  • The compiler takes a syntax tree and turns it
    into bytecode
  • Very little optimization is done here.
  • Optimization is expensive and optional
  • Pretty straightforward: this isn't rocket science

15
The Optimizer
  • We can rebuild it.
  • Make it better, faster, stronger

16
The Optimizer
  • Takes plain bytecode and makes it faster
  • Does all the sorts of things that you expect an
    optimizer to do: code motion, loop unrolling,
    common subexpression work, etc.
  • Will be an iterative process
  • This will be interesting, as perl's a pain to
    optimize
  • An optional step, of course

17
Things that make optimizing perl tough
  • Active data
  • Runtime redefinitions of everything
  • Really, really late binding (Waiting for Godot
    late)
  • Perl programmers are used to more predictable
    runtime characteristics than, say, C programmers.

18
The Interpreter
  • Polly want a cracker?

19
Interpreter goals
  • Fast
  • Tuned for perl
  • Language neutral where possible
  • Event capable
  • Sandboxable
  • Asynchronous I/O built in
  • Built with an eye towards TIL and/or native code
    compilation
  • Better debugging support than perl 5

20
The perl 6 interpreter is a software CPU
  • Complete with registers and an assembly language
  • This can make translating perl 6 bytecode into
    native machine code easier
  • There's a lot of literature on building optimizing
    compilers that can be leveraged
  • While more complex than a pure stack-based
    machine, it's also faster
  • Opcode dispatch needs to be faster than perl 5
  • Opcode functions can be written in perl

21
CPU specs
  • 64 int, float, string, and PMC registers
  • A segmented multiple stack architecture
  • Interrupt-capable (for events)
  • Pretty much completely position independent:
    everything is referenced via register, pad entry,
    or name
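
The "software CPU" idea from the last two slides can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not the actual perl 6 design: the opcode names, the three-operand instruction layout, and the Interp struct are all hypothetical, and only the integer register file is shown. Dispatch goes through a table of opcode functions.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_REGS 64   /* register count taken from the slide */

/* Hypothetical interpreter state: integer registers only. */
typedef struct {
    long int_reg[NUM_REGS];
    int  running;
} Interp;

/* Hypothetical opcodes; each instruction is an opcode plus three operands. */
enum { OP_END, OP_SET, OP_ADD, OP_MUL, OP_COUNT };

typedef struct {
    int  op;
    long a, b, c;
} Instr;

typedef void (*OpFunc)(Interp *, const Instr *);

/* OP_END: stop the run loop. */
static void op_end(Interp *in, const Instr *i) { (void)i; in->running = 0; }
/* OP_SET: reg[a] = constant b. */
static void op_set(Interp *in, const Instr *i) { in->int_reg[i->a] = i->b; }
/* OP_ADD: reg[a] = reg[b] + reg[c]. */
static void op_add(Interp *in, const Instr *i)
{
    in->int_reg[i->a] = in->int_reg[i->b] + in->int_reg[i->c];
}
/* OP_MUL: reg[a] = reg[b] * reg[c]. */
static void op_mul(Interp *in, const Instr *i)
{
    in->int_reg[i->a] = in->int_reg[i->b] * in->int_reg[i->c];
}

/* Dispatch table indexed by opcode number. */
static const OpFunc op_table[OP_COUNT] = { op_end, op_set, op_add, op_mul };

/* Run until OP_END; returns the number of instructions executed.
   Register operands are not bounds-checked in this sketch. */
size_t run(Interp *in, const Instr *code)
{
    size_t pc = 0, executed = 0;
    in->running = 1;
    while (in->running) {
        const Instr *i = &code[pc++];
        assert(i->op >= 0 && i->op < OP_COUNT);
        op_table[i->op](in, i);
        executed++;
    }
    return executed;
}

/* Example program: register 2 = (3 + 4) * 5. */
long demo(void)
{
    static const Instr prog[] = {
        { OP_SET, 0, 3, 0 },
        { OP_SET, 1, 4, 0 },
        { OP_ADD, 2, 0, 1 },
        { OP_SET, 3, 5, 0 },
        { OP_MUL, 2, 2, 3 },
        { OP_END, 0, 0, 0 },
    };
    Interp in = { {0}, 0 };
    run(&in, prog);
    return in.int_reg[2];
}
```

Because each instruction names its registers explicitly, a translator to native code could in principle map them onto machine registers, which is the advantage the slide claims over a pure stack machine.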

22
The regex engine
  • The regex engine is going to be part of the perl
    6 CPU, not separate as it is now
  • A good incentive to get opcode dispatch fast
  • Makes expanding the regex engine a bit easier
  • Details will be hidden as a set of regex opcodes

23
A few words on the stack system
  • Each register file has an associated stack
  • All registers of a particular type can be pushed
    onto or popped off the stack in one go
  • Individual registers or groups of registers can
    be pushed or popped
  • The stacks are all segmented so we're not relying
    on finding contiguous chunks of memory for them
  • There's also a set of call and scratch stacks
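
A segmented stack of the kind described above could be built as a linked list of fixed-size chunks, so that growth never needs one large contiguous block. The sketch below is an assumption-laden illustration: the names (Segment, SegStack, seg_push_all, and so on) and the tiny segment size are hypothetical, and only an integer stack is shown.

```c
#include <assert.h>
#include <stdlib.h>

#define SEG_SIZE 4   /* tiny on purpose, so small tests cross segment boundaries */

typedef struct Segment {
    long            items[SEG_SIZE];
    int             used;   /* number of occupied slots */
    struct Segment *prev;   /* next-older segment, or NULL */
} Segment;

typedef struct {
    Segment *top;           /* newest segment; never left empty */
} SegStack;

/* Push one value. Returns 0 on success, -1 if memory runs out. */
int seg_push(SegStack *s, long v)
{
    if (s->top == NULL || s->top->used == SEG_SIZE) {
        Segment *seg = malloc(sizeof *seg);
        if (seg == NULL)
            return -1;
        seg->used = 0;
        seg->prev = s->top;
        s->top = seg;
    }
    s->top->items[s->top->used++] = v;
    return 0;
}

/* Pop one value into *out. Returns 0 on success, -1 if the stack is empty. */
int seg_pop(SegStack *s, long *out)
{
    if (s->top == NULL)
        return -1;
    assert(s->top->used > 0);
    *out = s->top->items[--s->top->used];
    if (s->top->used == 0) {          /* release a drained segment */
        Segment *dead = s->top;
        s->top = dead->prev;
        free(dead);
    }
    return 0;
}

/* Push a whole register file "in one go" (regs[0] first). */
int seg_push_all(SegStack *s, const long *regs, int n)
{
    for (int i = 0; i < n; i++)
        if (seg_push(s, regs[i]) != 0)
            return -1;
    return 0;
}

/* Restore a register file saved by seg_push_all (popped in reverse order). */
int seg_pop_all(SegStack *s, long *regs, int n)
{
    for (int i = n - 1; i >= 0; i--)
        if (seg_pop(s, &regs[i]) != 0)
            return -1;
    return 0;
}
```

Saving six registers with a four-slot segment, for example, transparently spills into a second segment, and popping them back frees it again.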

24
Bytecode
  • Could you say that a little differently?

25
What is bytecode?
  • A distilled version of a program
  • Machine language for the PVM
  • Can contain a lot of 'extra' information,
    including full source
  • Designed to be platform independent
  • Should be mostly mappable as shared data (modulo
    the fixup sections)

26
Data Structures
  • Vtables and strings and floats, oh my!

27
Variables
  • Generically called a PMC
  • Bigger than Perl 5's base data structure
  • Synchronization data built-in
  • Same for all variable types
  • GC data is not part of base structure

[Diagram: PMC layout, with fields Vtable Pointer, Flags, Data Pointer, Integer Value, Float Value, Synchronization, GC Data]

28
Scalars
  • Built off the base PMC structure
  • Use the integer and float areas as caches
  • Data pointer points off to string, large int, or
    large float
  • Vtable functions determine how it all works
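
The base structure and the scalar caching described on these two slides might look something like this in C. The field types, the flag names, and the helper functions are assumptions made for illustration; the slides give only the field labels. Here the data pointer holds a string, and the integer and float areas cache its parsed value so the conversion is done at most once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct VTable;   /* left opaque in this sketch */

/* Hypothetical flag bits marking which cache areas hold a valid value. */
enum { PMC_INT_VALID = 1, PMC_NUM_VALID = 2 };

typedef struct PMC {
    const struct VTable *vtable;   /* vtable pointer */
    unsigned             flags;
    void                *data;     /* string, large int, or large float */
    long                 int_val;  /* integer cache */
    double               num_val;  /* float cache */
    void                *sync;     /* placeholder for synchronization data */
    void                *gc_data;  /* placeholder for GC bookkeeping */
} PMC;

/* Make p a scalar whose value is the NUL-terminated string s (not copied). */
void scalar_init_str(PMC *p, char *s)
{
    p->vtable  = NULL;
    p->flags   = 0;
    p->data    = s;
    p->int_val = 0;
    p->num_val = 0.0;
    p->sync    = NULL;
    p->gc_data = NULL;
}

/* Integer value: parsed from the string on first use, then served from cache. */
long scalar_get_int(PMC *p)
{
    if (!(p->flags & PMC_INT_VALID)) {
        assert(p->data != NULL);
        p->int_val = strtol((const char *)p->data, NULL, 10);
        p->flags |= PMC_INT_VALID;
    }
    return p->int_val;
}

/* Float value: same caching scheme, using the float area. */
double scalar_get_num(PMC *p)
{
    if (!(p->flags & PMC_NUM_VALID)) {
        assert(p->data != NULL);
        p->num_val = strtod((const char *)p->data, NULL);
        p->flags |= PMC_NUM_VALID;
    }
    return p->num_val;
}
```

A real implementation would also have to invalidate the cache flags whenever the string changes; that step is left out here.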

29
Arrays
  • Built off the base PMC structure
  • Data pointer points to array data
  • All perl 6 arrays are typed
  • May have an array of scalars, strings, integers,
    or floats
  • Arrays only take up enough memory to hold their
    types

30
Hashes
  • Built off the base PMC structure
  • Data pointer points to hash data
  • All perl 6 hashes are typed
  • May have a hash of scalars, strings, integers, or
    floats
  • Hashes only take up enough memory to hold their
    types
  • Hashing function is overridable

31
Strings
  • Strings are sort of abstract
  • Perl 6 can mix and match string data (Unicode,
    ASCII, EBCDIC, etc)
  • New string types can be loaded on the fly

[Diagram: string header, with fields Buffer Start, Buffer Length, String Length, Flags, String Size, Encoding, Type, Unused]

32
String handling
  • Perl 6 has no 'built-in' string support: all
    string support is via loadable libraries
  • There'll be Unicode, ASCII, and EBCDIC support
    provided (at least) to start
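
One way to keep string handling out of the core is to have every string header point at a table of functions for its encoding. The sketch below assumes that approach; the Encoding and StrHeader types, their field names, and the single char_count operation are hypothetical, and the UTF-8 counter assumes well-formed input.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-encoding function table (one operation shown). */
typedef struct Encoding {
    const char *name;
    size_t (*char_count)(const unsigned char *buf, size_t bytes);
} Encoding;

/* Hypothetical string header: raw bytes plus the encoding that interprets them. */
typedef struct {
    const unsigned char *buf;     /* buffer start */
    size_t               bytes;   /* bytes in use */
    const Encoding      *enc;
} StrHeader;

/* ASCII: one byte per character. */
static size_t ascii_count(const unsigned char *buf, size_t bytes)
{
    (void)buf;
    return bytes;
}

/* UTF-8: count every byte that is not a continuation byte (10xxxxxx). */
static size_t utf8_count(const unsigned char *buf, size_t bytes)
{
    size_t n = 0;
    for (size_t i = 0; i < bytes; i++)
        if ((buf[i] & 0xC0) != 0x80)
            n++;
    return n;
}

const Encoding ENC_ASCII = { "ascii", ascii_count };
const Encoding ENC_UTF8  = { "utf8",  utf8_count  };

/* Length in characters, whatever the encoding happens to be. */
size_t str_length(const StrHeader *s)
{
    assert(s->enc != NULL);
    return s->enc->char_count(s->buf, s->bytes);
}
```

The same five bytes then report a different character length depending on which encoding table the header carries, and a new encoding only needs to supply another table.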

33
Numbers
  • Bigints and bigfloats share the same header
  • Arbitrary-length floating point and integer
    numbers are supported
  • Perl automagically upgrades ints and floats when
    needed

[Diagram: shared bigint/bigfloat header, with fields Buffer Pointer, Length, Exponent, Flags]

34
Vtables
  • All variable data access is done through a table
    of functions that the variable carries around
    with it
  • This allows us faster access, since code paths
    are specialized for just the functions they need
    to perform
  • Isolates us from the implementation of variables
    internally
  • Allows special purpose behaviour (like perl 5's
    magic) to be attached without cost to the rest of
    perl

35
Vtables (cont'd)
  • Makes thread safety easier
  • A little bit more overhead because of the extra
    level of indirection, but the smaller functions
    make up for that
  • Vtable functions can be written in perl. (Each
    class with objects blessed into it will have at
    least one)
  • There may be more than one vtable per package

36
Vtables hide data manipulation
  • Pretty much all the code to handle data
    manipulation will be done via variable vtables
  • This allows the variable implementation to change
    without perl needing to know
  • Allows far more flexibility in what you can make
    a variable do
  • Shortens the code path for data functions and
    trims out extraneous conditionals

37
For example: fetching the string value of a scalar
  • For scalars with strings:
      String get_str(PMC my_PMC) {
        return my_PMC->data_pointer;
      }
  • For int-only scalars:
      String get_str(PMC my_PMC) {
        my_PMC->data_pointer = make_string(my_PMC->integer);
        my_PMC->vtable = int_and_string_vtable;
        return my_PMC->data_pointer;
      }
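
The slide's pseudocode for fetching a string value can be fleshed out into compilable C along these lines. Everything beyond the names on the slide is an assumption: the one-entry vtable layout, the use of plain C strings, the snprintf-based make_string, and the pmc_* helper functions are all hypothetical. The point it shows is the vtable swap: the first string fetch from an int-only scalar builds the string and replaces the vtable, so later fetches take the short path with no conditionals.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct PMC PMC;

/* Hypothetical vtable with a single entry. */
typedef struct VTable {
    const char *(*get_str)(PMC *);
} VTable;

struct PMC {
    const VTable *vtable;
    char         *data_pointer;   /* string value, once one exists */
    long          integer;
};

static const char *str_get_str(PMC *p);
static const char *int_get_str(PMC *p);

static const VTable int_only_vtable       = { int_get_str };
static const VTable int_and_string_vtable = { str_get_str };

/* Hypothetical helper: heap-allocated decimal text for an integer
   (returns NULL if allocation fails). */
static char *make_string(long v)
{
    char buf[32];
    snprintf(buf, sizeof buf, "%ld", v);
    char *s = malloc(strlen(buf) + 1);
    if (s != NULL)
        strcpy(s, buf);
    return s;
}

/* Scalar that already has a string: just hand it back. */
static const char *str_get_str(PMC *p)
{
    return p->data_pointer;
}

/* Int-only scalar: build the string once, then switch vtables. */
static const char *int_get_str(PMC *p)
{
    p->data_pointer = make_string(p->integer);
    p->vtable = &int_and_string_vtable;
    return p->data_pointer;
}

/* Initialize p as an int-only scalar. */
void pmc_set_int(PMC *p, long v)
{
    p->vtable = &int_only_vtable;
    p->data_pointer = NULL;
    p->integer = v;
}

/* All string access goes through whatever vtable the PMC carries. */
const char *pmc_get_str(PMC *p)
{
    assert(p->vtable != NULL);
    return p->vtable->get_str(p);
}

/* Reports whether the swap to the int-and-string vtable has happened. */
int pmc_has_string_vtable(const PMC *p)
{
    return p->vtable == &int_and_string_vtable;
}
```

Callers only ever use pmc_get_str; which function actually runs depends on the state the variable is in.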

38
Memory Management
  • Now where did I put that?

39
Getting headers
  • All the fixed-size things (PMCs, string/number
    headers) get allocated from arenas
  • All headers, with the exception of PMCs (maybe)
    are moveable by the garbage collector
  • Non-PMC header allocation is very fast
  • PMC allocation is only mostly fast

40
Buffer Management
  • Anything that isn't a fixed size gets allocated
    from the buffer pools
  • All buffered data, with the exception of data
    allocated in special pools, is moveable by the
    garbage collector
  • Because of GC, allocation is very quick
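
The reason a compacting collector makes allocation quick is that free space stays in one piece, so allocating is little more than advancing an offset. A minimal sketch, assuming a single fixed-size pool (the Pool type, the pool size, and the 8-byte rounding are illustrative choices, not from the slides):

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 1024

/* Hypothetical buffer pool: one contiguous block plus a "next free" offset. */
typedef struct {
    unsigned char mem[POOL_SIZE];
    size_t        next;   /* offset of the first free byte */
} Pool;

void pool_init(Pool *p)
{
    p->next = 0;
}

/* Bump allocation: round the size up to a multiple of 8 and advance the offset.
   Returns NULL when the pool is full; a real system would run the
   garbage collector at that point and retry. */
void *pool_alloc(Pool *p, size_t n)
{
    size_t aligned = (n + 7u) & ~(size_t)7u;
    if (aligned < n || aligned > POOL_SIZE - p->next)
        return NULL;
    void *out = p->mem + p->next;
    p->next += aligned;
    return out;
}
```

There is no free list to search and no per-block free call; space is reclaimed only when the collector compacts the pool and resets the offset.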

41
Garbage Collection
  • Bring out yer dead!

42
The perl 6 GC is a copying collector
  • Everything except PMCs is moveable in Perl 6
  • PMCs might be moveable too
  • We get a compact memory heap out of this, which
    allows for fast allocation
  • Perl 6 will release empty memory back to the
    system when it can
  • Refcounts are used only to note object lifetimes,
    not for GC
  • Refcounts, for the most part, are dead
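
A copying collector in the spirit of this slide can be sketched with two equal memory spaces and a set of fixed headers that own movable buffers. This is a simplified illustration under several assumptions: the Heap and Header types and all function names are hypothetical, and liveness is an explicit flag here, whereas a real collector would discover live data by tracing from roots.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SPACE_SIZE  256
#define MAX_HEADERS 8

/* Header that stays put and owns a buffer the collector may move. */
typedef struct {
    int    live;
    char  *buf;
    size_t len;
} Header;

typedef struct {
    char   space[2][SPACE_SIZE];   /* the two semispaces */
    int    cur;                    /* index of the active semispace */
    size_t next;                   /* bump offset in the active semispace */
    Header headers[MAX_HEADERS];
} Heap;

void heap_init(Heap *h)
{
    memset(h, 0, sizeof *h);
}

/* Copy every live buffer into the other semispace, packed end to end,
   point each header at its buffer's new home, then switch spaces. */
void heap_collect(Heap *h)
{
    int to = 1 - h->cur;
    size_t next = 0;
    for (int i = 0; i < MAX_HEADERS; i++) {
        Header *hd = &h->headers[i];
        if (!hd->live) {
            hd->buf = NULL;
            hd->len = 0;
            continue;
        }
        memcpy(h->space[to] + next, hd->buf, hd->len);
        hd->buf = h->space[to] + next;
        next += hd->len;
    }
    h->cur = to;
    h->next = next;
}

/* Give header `slot` a fresh buffer of `len` bytes, collecting once if the
   active space is too full. Returns 0 on success, -1 if it still won't fit. */
int heap_alloc(Heap *h, int slot, size_t len)
{
    assert(slot >= 0 && slot < MAX_HEADERS);
    h->headers[slot].live = 0;     /* any old buffer in this slot is now garbage */
    if (len > SPACE_SIZE - h->next) {
        heap_collect(h);
        if (len > SPACE_SIZE - h->next)
            return -1;
    }
    h->headers[slot].live = 1;
    h->headers[slot].buf  = h->space[h->cur] + h->next;
    h->headers[slot].len  = len;
    h->next += len;
    return 0;
}

/* Mark a header's buffer as dead; its space is reclaimed at the next collection. */
void heap_release(Heap *h, int slot)
{
    assert(slot >= 0 && slot < MAX_HEADERS);
    h->headers[slot].live = 0;
}
```

After a collection the surviving buffers sit contiguously at the start of the new space, which is what keeps bump allocation possible. Code that keeps a raw pointer to a buffer across an allocation would be left dangling, which is why access goes through the header.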

43
GC considerations for Objects
  • Garbage collection and object death are now
    separate things
  • Perl's guarantee of timely object death is
    stronger
  • We still don't guarantee perfect collection (but
    it sucks less)
  • We still refcount for real perl references, but
    only 2 bits are used
  • Objects with more than two simultaneous
    references won't get collected until a full dead
    variable scan is made
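
One reading of the two-bit reference count is a saturating counter: small counts are tracked exactly and give prompt destruction, while a count that reaches the maximum sticks there and leaves the object to the full scan. The sketch below follows that reading; it is an assumption about the mechanism, and the Obj type and function names are hypothetical.

```c
#include <assert.h>

enum { RC_MAX = 3 };   /* largest value that fits in 2 bits; treated as "many" */

typedef struct {
    unsigned refcount  : 2;
    unsigned destroyed : 1;
} Obj;

/* A new object starts with one reference. */
void obj_init(Obj *o)
{
    o->refcount = 1;
    o->destroyed = 0;
}

/* Add a reference; the count saturates at RC_MAX instead of wrapping. */
void obj_ref(Obj *o)
{
    if (o->refcount < RC_MAX)
        o->refcount++;
}

/* Drop a reference. Returns 1 if the object was destroyed promptly.
   A saturated count is never decremented, because the true number of
   references is unknown; such objects wait for the full dead-variable scan. */
int obj_unref(Obj *o)
{
    if (o->refcount == RC_MAX)
        return 0;
    assert(o->refcount > 0);
    if (--o->refcount == 0) {
        o->destroyed = 1;
        return 1;
    }
    return 0;
}
```

An object that never has more than two references is destroyed the moment the last one goes away; one that was ever referenced three or more times at once is not.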

44
Extensions beware!
  • Since we have no refcounts, extensions must tell
    perl when they hold on to PMCs
  • Not a huge deal, as we piggy-back on the
    cross-interpreter PMC tracking we use for threads
  • No more struct PMC in extensions...

45
Extending Perl 6
46
Extensions Made Easier
  • Perl 6 will have a real API
  • The API is multilevel:
      • Simple for embedders
      • More complex for extension authors
      • Pretty messy for vtable or opcode writers
  • Binary compatibility is a very strong
    consideration

47
Embedding
  • Guaranteed stable and binary compatible for the
    life of perl 6
  • Very simple API:
      • Create interpreter
      • Destroy interpreter
      • Parse source
      • Run code
      • Register native functions

48
Extensions
  • Much simpler interface to perl's internals
  • The gory details are hidden
  • Stable binary compatibility is a very strong goal
  • We may add functions or options, but we won't
    take them away
  • Extensions built for perl 6.0.1 should still run
    with perl 6.8.12 without rebuilding
  • Manipulating perl data should be much easier
  • If you have to resort to Inline to wrap a library
    then it means we've not got it right

49
Extensions (cont'd)
  • Inline, or something like it, is probably going
    to be the standard for extending perl
  • XS, when you have to resort to it, will be far
    less nasty than it is now

50
Homegrown Opcodes and Vtables
  • This is part of the grubby inside of perl 6
  • You can use any of the internal routines of perl
  • If you do, though, you may run into
    backward-compatibility issues at some point. (If
    it's not part of the embedding, utility, or
    extension API, we make no promises)
  • There's no guarantee that calling conventions
    won't change.
  • No guarantees that perl 6.4 will even use vtables
    or opcodes

51
Utility library
  • Perl 6 will provide a set of utility routines to
    handle common tasks:
      • String manipulation
      • Encoding changes (Shift-JIS to Unicode, EBCDIC to
        ASCII)
      • Conversion routines (string to int or float)
      • Extended precision math (int and float)
  • These will be stable, like the rest of the API

52
Variations on a Theme
  • Toccata and Fugue in perl minor, by Wall

53
The source doesn't have to be perl
  • The parser isn't obligated to be parsing perl
  • Input source could be Python, Ruby, Java, or
    INTERCAL
  • The full perl parser is optional

54
The interpreter doesn't have to interpret
  • The interpreter is the destination for bytecode,
    but it doesn't have to interpret it
  • It might save directly to disk
  • It might translate the bytecode into an alternate
    form: Java bytecode, .NET code, or executable
    code, for example
  • The interpreter might translate to machine code
    on the fly, as a sort of JIT compiler. (Well,
    really a TIL, but...)