The PHP podcast where everyone chimes in.

Originally aired on

July 7th, 2017

064: PHP 7 Source Code: A Deep Dive

We take a deep dive into the underlying structure of the PHP source code and talk about the scanner, parser, the new AST layer (and the evil things we can do with it), and the Zend engine. Let's see how the PHP sausage is made!

with


PHP 7 Internals: Scanning, Parsing, AST, and Engine

Show Summary


Background

Previous podcasts about PHP internals provide some context for this episode.

PHP 7.2 alpha 3

  • Next release is beta 1
  • Feature freeze/feature slushie is impending
  • The retry keyword will not be in PHP 7.2

What is your setup when working on PHP internals?

Compiling PHP

  • PHP is not difficult to compile, but if you only use PHP in userland, use a pre-compiled binary from a package manager
  • If you hit an issue, using a pre-compiled binary cuts out a lot of ambiguity about how PHP was built
  • The php-src repository does not include a configure script because it is generated from other files in the repository (by running ./buildconf). Requiring users to generate it removes the possibility of configure being out of sync.

The scanner and the parser

  • PHP uses re2c (regular expression to C) to generate a scanner (or lexer) and Bison to generate a parser
  • The scanner splits the input PHP code into tokens
  • The parser groups and arranges those tokens into meaningful expressions
  • In PHP 5, the parser directly converted parsed expressions to opcodes
  • In PHP 7, the parser first builds an Abstract Syntax Tree (AST), which the compiler then turns into opcodes
  • Using an AST allows the compiler to perform optimisations
  • Sara's blog post about the compiler processes
  • Sara's blog post about lexers and parsers
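
To make the scanner step concrete, here is a minimal hand-written sketch in C. It is not PHP's scanner, which is generated by re2c and recognises far more token types. The token kinds (TOK_VARIABLE, TOK_NUMBER, TOK_OPERATOR, TOK_END) and the functions next_token and scan are hypothetical names invented for this illustration, and the parser and AST stages are not shown.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds for illustration; PHP's real scanner has many more. */
typedef enum { TOK_VARIABLE, TOK_NUMBER, TOK_OPERATOR, TOK_END } TokenKind;

typedef struct {
    TokenKind kind;
    const char *start; /* first character of the token inside the input */
    size_t length;     /* number of characters in the token */
} Token;

/* Read one token starting at *cursor and advance the cursor past it. */
Token next_token(const char **cursor) {
    assert(cursor != NULL && *cursor != NULL);
    const char *p = *cursor;

    /* Whitespace separates tokens but is not a token in this sketch. */
    while (isspace((unsigned char)*p)) p++;

    Token tok = { TOK_END, p, 0 };
    if (*p == '\0') {
        /* End of input: keep TOK_END with length 0. */
    } else if (*p == '$' && (isalpha((unsigned char)p[1]) || p[1] == '_')) {
        /* A variable: '$' followed by an identifier. */
        tok.kind = TOK_VARIABLE;
        p++;
        while (isalnum((unsigned char)*p) || *p == '_') p++;
    } else if (isdigit((unsigned char)*p)) {
        /* An integer literal: one or more digits. */
        tok.kind = TOK_NUMBER;
        while (isdigit((unsigned char)*p)) p++;
    } else {
        /* Anything else is treated as a one-character operator. */
        tok.kind = TOK_OPERATOR;
        p++;
    }
    tok.length = (size_t)(p - tok.start);
    *cursor = p;
    return tok;
}

/* Fill out[] with at most max tokens; returns the count, TOK_END included. */
size_t scan(const char *input, Token *out, size_t max) {
    size_t n = 0;
    while (n < max) {
        out[n] = next_token(&input);
        if (out[n++].kind == TOK_END) break;
    }
    return n;
}
```

Scanning the input "$a = 1 + 22;" with this sketch yields a variable, an operator, a number, an operator, a number, an operator, and the end marker. A parser would then group that flat list into an assignment whose right-hand side is an addition.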

How does the executor (or VM) work?

  • In PHP 4, this was basically a huge while loop with a switch statement in it:
    • Loop over each opline
    • switch statement defines behaviour for each opcode
  • PHP 5 used a call VM:
    • We still loop over each opline
    • Instead of the big switch, each opline has a handler field containing a pointer to a callback function
  • In PHP 7, the call VM is still in use but other styles are also used, including computed goto
  • The Zend VM is generated by a PHP file. You can't build the PHP executor without PHP
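
The difference between the switch-style loop and the call-style VM can be sketched in C roughly as follows. This is a toy stack machine, not the Zend VM: the three opcodes, the Opline and VM structs, and the run_switch and run_call functions are all hypothetical names chosen for illustration. Computed goto is a compiler extension and is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-opcode instruction set, far smaller than the Zend VM's. */
typedef enum { OP_PUSH, OP_ADD, OP_HALT } Opcode;

typedef struct VM VM;
typedef struct Opline Opline;

/* A handler executes one opline; it returns 0 to stop the loop. */
typedef int (*Handler)(VM *vm, const Opline *op);

struct Opline {
    Opcode opcode;
    int operand;     /* only meaningful for OP_PUSH */
    Handler handler; /* only used by the call-style loop */
};

struct VM {
    int stack[16];
    size_t top;
};

static void push(VM *vm, int v) { assert(vm->top < 16); vm->stack[vm->top++] = v; }
static int pop(VM *vm) { assert(vm->top > 0); return vm->stack[--vm->top]; }

/* Style 1: loop over each opline and switch on its opcode. */
int run_switch(const Opline *ops) {
    VM vm = { {0}, 0 };
    for (const Opline *op = ops; ; op++) {
        switch (op->opcode) {
        case OP_PUSH:
            push(&vm, op->operand);
            break;
        case OP_ADD: {
            int b = pop(&vm);
            int a = pop(&vm);
            push(&vm, a + b);
            break;
        }
        case OP_HALT:
            return pop(&vm);
        }
    }
}

/* Style 2: each opline carries a pointer to the function that executes it. */
static int handle_push(VM *vm, const Opline *op) { push(vm, op->operand); return 1; }
static int handle_add(VM *vm, const Opline *op) {
    (void)op;
    int b = pop(vm);
    int a = pop(vm);
    push(vm, a + b);
    return 1;
}
static int handle_halt(VM *vm, const Opline *op) { (void)vm; (void)op; return 0; }

int run_call(const Opline *ops) {
    VM vm = { {0}, 0 };
    const Opline *op = ops;
    /* No switch: the loop just calls whatever handler the opline points at. */
    while (op->handler(&vm, op)) op++;
    return pop(&vm);
}
```

For a program that pushes 2, pushes 3, adds, and halts, both loops return 5. The call style moves the "which opcode is this?" decision out of the loop and into the handler pointer stored on each opline.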

Testing

  • PHP internals has ~15,000 tests
  • The full test suite doesn't support parallelisation - the tests all run in series
  • PHP TestFest aims to encourage people to add tests to PHP Internals
  • The test suite for PHP is written in PHP - you don't need to know C to contribute tests
  • PHP Code Coverage allows us to discover PHP internals code that is missing tests

PHP Internals Documentation

Sammy Kaye wraps up with

Sara Golemon

  • Sara is @SaraMG on Twitter
  • She works on getting XP in video games and torturing PHP
  • Works on the XHP extension for PHP - XHTML embedded in PHP

Developer Shout-Out

The Developer Shout-Out recognizes developers in the community for their contributions.

For this episode, our panel guest Sara nominated Julien Pauli for the Developer Shout-Out segment.

Thank you, Julien Pauli, for all your hard work on php-src. A $50 Amazon gift card is on its way to you.

$50 Amazon gift card sponsored by Zend Training

Show Notes Credit

Chris Shaw

Thank you, Chris Shaw, for authoring the show notes for this episode!

If you'd like to contribute show notes and totally get credit for it, check out the show-notes repo!