Make your PHP 7 faster (GCC PGO)

We have been working hard to improve the performance of PHP7. Last month we noticed that GCC's PGO can improve WordPress performance by nearly 10%, which got us very excited.

However, PGO (Profile-Guided Optimization; you can Google it if you are interested), as its name suggests, needs feedback from running real use cases, which means the optimization is tied to a specific scenario.

An optimization tuned for one scenario may not help in another; it is not universal. So we cannot simply ship these optimizations, nor can we release a PGO-compiled PHP7 directly.

Of course, we are trying to identify the general optimizations that PGO finds and apply them to PHP7 by hand, but this obviously cannot match what a build tuned for one specific scenario achieves. So I decided to write this article to briefly show how to compile PHP7 with PGO, so that the PHP7 you build makes your own application faster.

First, you need to decide which scenario to use as feedback for GCC. We usually choose the page in the scenario you want to optimize that gets the most visits, takes the most time, and consumes the most resources.

Take WordPress as an example: we choose its homepage (because the homepage is usually the most visited).
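If you are not sure which page gets the most visits, the web server's access log can tell you. The following is a minimal sketch, assuming an access log in the common "combined" format, where the request path is the seventh whitespace-separated field. The function name `top_pages` is illustrative and not part of the original setup.

```shell
# top_pages LOGFILE [N]
# Print the N most requested paths (default 5), each prefixed by its hit count.
# Assumption: combined log format, so field 7 is the request path.
top_pages() {
    awk '{ print $7 }' "$1" | sort | uniq -c | sort -rn | head -n "${2:-5}"
}
```

For example, `top_pages /var/log/nginx/access.log 10` would list the ten most requested paths (the log location is an assumption and depends on your installation).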

Let's take my machine as an example:

    Intel(R) Xeon(R) CPU X5687 @ 3.60GHz x 16 (Hyper-Threading)
    48 GB memory

php-fpm runs a fixed pool of 32 workers, and opcache uses its default configuration (be sure to load opcache).

We take WordPress 4.1 as the optimization scenario.
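As a rough illustration of that setup, the fragment below shows how a fixed pool of 32 workers and a loaded opcache are typically configured. It is a sketch, not the author's actual configuration: the pool name `[www]` is php-fpm's usual default, and the extension file name `opcache.so` assumes a Unix-like build.

```ini
; php-fpm pool configuration (assumed: the default "www" pool)
[www]
pm = static             ; fixed number of workers
pm.max_children = 32    ; 32 workers, as in the test setup

; php.ini (assumed location of the opcache setting)
zend_extension=opcache.so   ; load opcache; other opcache settings stay at their defaults
```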

First, let's test the current performance of WP in PHP7 (ab -n 10000 -c 100):

    $ ab -n 10000 -c 100 http://inf-dev-maybach.weibo.com:8000/wordpress/
    This is ApacheBench, Version 2.3 <$Revision: 655654 $>
    Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
    Licensed to The Apache Software Foundation, http://www.apache.org/

    Benchmarking inf-dev-maybach.weibo.com (be patient)
    Completed 1000 requests
    Completed 2000 requests
    Completed 3000 requests
    Completed 4000 requests
    Completed 5000 requests
    Completed 6000 requests
    Completed 7000 requests
    Completed 8000 requests
    Completed 9000 requests
    Completed 10000 requests
    Finished 10000 requests

    Server Software:        nginx/1.7.12
    Server Hostname:        inf-dev-maybach.weibo.com
    Server Port:            8000

    Document Path:          /wordpress/
    Document Length:        9048 bytes

    Concurrency Level:      100
    Time taken for tests:   8.957 seconds
    Complete requests:      10000
    Failed requests:        0
    Write errors:           0
    Total transferred:      92860000 bytes
    HTML transferred:       90480000 bytes
    Requests per second:    1116.48 [#/sec] (mean)
    Time per request:       89.567 [ms] (mean)
    Time per request:       0.896 [ms] (mean, across all concurrent requests)
    Transfer rate:          10124.65 [Kbytes/sec] received

As you can see, on this machine the WordPress 4.1 home page currently reaches 1116.48 QPS; that is, PHP7 can serve that many home page requests per second.
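When repeating the benchmark several times, it can be handy to pull just the QPS figure out of the ab report. A minimal sketch follows; the helper name `ab_qps` is made up for illustration, and it relies only on the "Requests per second" line that ab prints.

```shell
# ab_qps: read an ApacheBench report on stdin and print the mean requests per second.
ab_qps() {
    awk '/^Requests per second:/ { print $4 }'
}
```

It would be used as `ab -n 10000 -c 100 http://your-host/wordpress/ | ab_qps`, where the URL is a placeholder for your own test page.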

Now, let's start teaching GCC how to compile a PHP7 that runs WordPress 4.1 faster. GCC 4.0 or above is required, but I recommend GCC 4.8 or above (the current release is GCC 5.1).

The first step is to download the PHP7 source code and run ./configure. Nothing here differs from a normal build.

Now comes the difference: we first compile PHP7 so that it produces an instrumented executable, one that will generate the profile data when it runs:

    $ make prof-gen

Note that we use the prof-gen target (it is specific to PHP7's Makefile, so don't try this on other projects :) )

Then, let's start training GCC:

    $ sapi/cgi/php-cgi -T 100 /home/huixinchen/local/www/htdocs/wordpress/index.php >/dev/null

That is, we let php-cgi run the WordPress homepage 100 times, generating profile information in the process.

Then, we start compiling PHP7 for the second time.

    $ make prof-clean
    $ make prof-use && make install
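The whole cycle (instrumented build, training run, second build) can be wrapped in a small script. This is only a sketch under stated assumptions: it must be run from the root of a configured PHP7 source tree, and the names `run`, `train`, `pgo_build` and the `DRY_RUN` variable are invented here for illustration. The make targets and the php-cgi invocation are the ones shown above.

```shell
# run CMD...: execute CMD, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# train RUNS SCRIPT: run SCRIPT RUNS times with the instrumented php-cgi.
train() {
    sapi/cgi/php-cgi -T "$1" "$2" >/dev/null
}

# pgo_build SCRIPT [RUNS]: full PGO cycle; stops at the first failing step.
pgo_build() {
    script="$1"
    runs="${2:-100}"                      # 100 runs, as in the article
    run make prof-gen &&                  # first build: instrumented binaries
    run train "$runs" "$script" &&        # training run produces the profile data
    run make prof-clean &&                # clean up before the second build
    run make prof-use &&                  # second build: uses the profile data
    run make install
}
```

Calling `DRY_RUN=1 pgo_build /path/to/wordpress/index.php` prints the steps without executing them, which is a cheap way to check the sequence before starting a long build; the path is a placeholder for your own entry script.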

OK, that's it: the PGO build is complete. Now let's look at the performance of the PGO-compiled PHP7:

    $ ab -n10000 -c 100 http://inf-dev-maybach.weibo.com:8000/wordpress/
    This is ApacheBench, Version 2.3 <$Revision: 655654 $>
    Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
    Licensed to The Apache Software Foundation, http://www.apache.org/

    Benchmarking inf-dev-maybach.weibo.com (be patient)
    Completed 1000 requests
    Completed 2000 requests
    Completed 3000 requests
    Completed 4000 requests
    Completed 5000 requests
    Completed 6000 requests
    Completed 7000 requests
    Completed 8000 requests
    Completed 9000 requests
    Completed 10000 requests
    Finished 10000 requests

    Server Software:        nginx/1.7.12
    Server Hostname:        inf-dev-maybach.weibo.com
    Server Port:            8000

    Document Path:          /wordpress/
    Document Length:        9048 bytes

    Concurrency Level:      100
    Time taken for tests:   8.391 seconds
    Complete requests:      10000
    Failed requests:        0
    Write errors:           0
    Total transferred:      92860000 bytes
    HTML transferred:       90480000 bytes
    Requests per second:    1191.78 [#/sec] (mean)
    Time per request:       83.908 [ms] (mean)
    Time per request:       0.839 [ms] (mean, across all concurrent requests)
    Transfer rate:          10807.45 [Kbytes/sec] received

Now we can handle 1191.78 requests per second, an improvement of ~7%. Not bad. (Hey, didn't you say 10%? How did it become 7%? Haha, as I said before, we have been analyzing what PGO optimizes and manually applying the more general optimizations to PHP7. In other words, that ~3% of general optimizations is already included in PHP7. This work is of course still ongoing.)

So it's that simple. You can train GCC with the typical scenarios of your own product, and with just a few simple steps you get an improvement. Why not?
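For reference, the ~7% figure follows directly from the two ab results: (1191.78 - 1116.48) / 1116.48. A small helper to redo this arithmetic for your own measurements might look like this (the name `qps_gain` is illustrative):

```shell
# qps_gain BEFORE AFTER: print the relative improvement in percent, two decimals.
qps_gain() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.2f\n", (after - before) / before * 100 }'
}

qps_gain 1116.48 1191.78   # prints 6.74
```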
