path issue when using phar #115
The first script works fine for me with OPcache.
However, I'm not sure what do you ,mean by MHE. Probably, it is the real reason of the problem, but I don't know how to reproduce it.
Anyway, it must be more robust to substitute './' with __DIR__ constant.
sorry i didn't explain the problem correctly,
I run a small web hosting server for friends and they have 100's of websites using PHP 5.5 with OPcache cause problems with scripts that use relative paths IE (./../) for example lets say most of them use wordPress or zend framework and someone made awesome plugin that is contained in phar file, if you use relative paths it somehow triggers a bug in OPcache that try to load the phar not from the website that requested the file but from the other website that has the same plugin.
i hope i made it clear enough for you to understand what i mean
PS: English is not my native tongue
ok i'll try to make testcase tomorrow once i get back to my workstation
Does ZendOptimizerPlus support Phars in general? I thought it would be better to ask here than open a new issue.
I'm running into this, or a similar issue. I'll try to produce a really reduced testcase which I'll append here, but let me give you an overview.
I have php-fpm running, serving two different versions of an application from two different directories.
Both application use a phar-file (which gets include()d directly) with the same name, but with different contents (the two applications are the same, but at different releases, so one phar file is newer than the other and between the two, some reorganization has happened).
Inside the phar file are multiple files which are included as part of the phar's stub.
Now the phar file itself is cached correctly, but their contents isn't.
Whichever phar file is loaded first has its contents stored in the cache which is then used to serve all future requests for files from an equally named phar file (but with different content).
ok. I can offer a reduced testcase now, though as it involves multiple vhosts, it's still a bit painful to set up. Sorry for that
First, opcache is configured with
opcache.use_cwd=1
opcache.revalidate_path=1
opcache.validate_timestamps=1
opcache.revalidate_freq=0So this is as "refreshy" as it gets.
Next, configure two vhosts using the files I've configured in this archive. Point each one to either the newer or older subdir.
The phar file in question is a php file in disguise (block.asset_compile.php - it's form my own sacy project which is a smarty plugin to compress assets. Both index.phps in both vhosts include the file and then run the plugin function with some mock data.
Now with a freshly restarted php-fpm, the first vhost you hit will work correctly, but the second one you hit will complain about a class not being found (it's either sacy_Config if you run the vhost named "newer" first, or sacy\Config if you run "older" first.
If you restart fpm and hit the respective other vhost first, the first hit, again, will work correctly, but the second one will fail.
As I said in my previous comment, I believe that opcache doesn't correctly namespace the files inside of the phar file, so the following statements from the phar stub:
include("phar://sacy.phar/sacy/ext-translators.php");
include("phar://sacy.phar/sacy/sacy.php");will immediately hit the cached variant, despite the actual phar file being located at different locations on the file system.
Thinking about this, I might even be able to reduce this testcase even further, brining this down to one vhost. If I can, I'll post another comment here.
Yes. My previous suspicions were correct - this doesn't require two vhosts at all.
If you use this archive (unpack to DOCUMENT_ROOT - I'm using a hard-coded include_path to show that it's not an include_path related issue) and you restart php-fpm, then you can either run using_older.php or using_newer.php, but whichever you run first is the only one you'll be able to run.
When you restart fpm (to flush the cache), then you can switch the file. The first one will always be fine, the second one will always fail.
So I stand by my initial investigation: opcache seems to take the symbolic name (using Phar::mapPhar) inside the phar as a cache key without taking the actual phar file's location into account.
I suspect OPCache can't make difference between Phar aliases created using Phar::mapPhar(),
and I don't see a way to solve it.
To workaround the problem it's probably possible to use a full path name to current file.
e.g.
$phar = __FILE__ . "/sacy.phar";
Phar::mapPhar($phar);
include("phar://$phar/sacy/ext-translators.php");
include("phar://$phar/sacy/sacy.php");
Yeah. I agree. The thing is though: In plain PHP, this works and so it does in 5.4 and earlier with APC, so this must at least be documented somewhere, or, considering that this is kinda dangerous, opcache should opt out of caching/optimizing stuff inside a phar archive.
In shared hosting environments, this is going to bite somebody I'm sure.
IMHO, both Phar and OPcache are core components of PHP, so these two extensions should interoperate seamlessly; virtualization of the phar root is a core feature that enables abstraction of any phar'ed application so I don't feel that the sort of workaround suggested by Dmitry is the right approach.
I generated a phar from the following PHP stub in /tmp and executed there as well
<?php
# phar.readonly must be 0
$stub = '<?php
Phar::interceptFileFuncs();
set_include_path("phar://this" . PATH_SEPARATOR . get_include_path());
require "index.php";
__HALT_COMPILER(); ?>';
$p = new Phar(dirname(__FILE__) . '/newphar.phar.php', 0, 'this');
$p['index.php'] = '<?php echo "index is brand new!\n"; require "hello.php";require "hello.php";';
$p['hello.php'] = '<?php echo "and hello from me!\n";';
$p->setStub($stub);
unset($p);I traced through the execution of both current PHP 5.4 and 5.5 versions with gdb. Both executions are functionally equivalent. At accel_deactivate() the ZCG(include_paths) table contains two entries ['A' => '.', 'B' => 'phar://this:.'] and the ZCG(hash) table contains three direct entries for:
- /tmp/newphar.phar.php
- phar:///tmp/newphar.phar.php/index.php
- phar:///tmp/newphar.phar.php/hello.php
So for so good, but it also includes indirect entries for:
- /tmp:index.php:/tmp:B
- /tmp:hello.php:phar:///tmp/newphar.phar.php:B
Hence subsequent fetches of both index.php and hello.php will match against the indirect keys entries (which map through the include_path onto the alias), with the entries pointing to the respective fully resolved entries. The issue comes when another phar is executed, say /tmp/otherphar.phar.php, and which also establishes an alias for this. The indirect entries will still be pointing the newphar entries, hence this bug.
The safest way of avoiding this would be for accel_make_persistent_key_ex() to explicitly resolve any phar URIs to convert any phar alias into its fully resolved absolute form. As far as I can see, there isn't a phar C callable entry to resolve a named alias within the current request (but I am not a phar expert); this is what we need to implement this. I could do some more investigation, but this is one that the phar maintainers should make know how to make quickly available.
If someone from the phar developers can suggest how to do this alias resolution, I can do the OPcache changes.
I've done some enumeration around use cases looking at OPcache internals. The issue is wider than I've described above. At its root, phar embeds the concept of aliases which can symbolically map names onto file system paths in the same way as file system symlinks do. The issue for OPcache is that aliases are scoped to the request and therefore consecutive requests can map any alias, e.g. this (as in the above example), to different paths. On the other hand OPcache assumes that the key <access name>:<current working directory>:<include path> is an invariant for the life of the cache.
The only safe way to address this is to assume that any phar://<some patten>/ might include such an alias and attempt to resolve it. At the moment the Phar extension doesn't provide this API function. It's only a few lines of C do this but this assumes access to Phar internal data structures. Better to add a C callable external (a dozen lines of C to be added to the Phar source) to some externally exposed header. Anyone including me code do this PHar code, but again this requires the endorsement of the Phar maintainers.
@dstogov, Dmitry you know who to pass this to. Can you please forward. Thanks.
I got a quick fix: https://gist.github.com/laruence/6988427
the sad thing is there is no active maintainer of phar now... :<
there isn't? Such a shame - phar is one of the coolest feature additions to PHP in the last few years.
@laruence, Thanks for this. Your patch is pretty much identical to the one I drafted, but since you've already put it up on gist, I'll switch to your version to keep it simple for everyone else, and do my tests based on it. I want to go through the various use cases that I've sketched out to make sure that this covers them all. My bandwidth re PHP is pretty low for the last few months, but that should get better net week when I return to the UK. I've busy doing too much walking and swimming on a Greek island, and my better half won't agree to us having Internet at our cottage, so I have to walk to a local Taverna to catch up over a jug of wine :LoL:
@pilif, +1 on Phar. I was aware of its functionality, but I hadn't really used it before I looked at it re this bug. It is an elegant package and as well as virtualizing applications, it gives material I/O performance benefits for Shared Hosting templates. The implementation is pretty straight forward. I wouldn't have implemented some bits as currently myself, but there's nothing that leaps out as meriting changing. I'd already started thinking about how to embedding file-based OPcaching to the Phar format and then it would really fly. Pretty straightforward technically, but we would need an extra opcode to defer binding of some "compile-time" constants such as __FILE__ and __DIR__ to load time.
I'll have a scan through the outstanding Phar issue list and see how much behind it is.
@laruence, can you point me to any threads on the background behind the Phar maintainer(s) deciding to move on. I might be interested in helping here, if this isn't a poison chalice.
I will do some tests with the patch proposed by @laruence - unfortunately I'm currently in the middle of something else, so my time for this is a bit limited, but I do have the patch integrated into my build infrastructure, so I can test it on a staging system, it's just that I currently lack the time.
@TerryE thanks, you are so welcome, for phar maintainer, you can write to [email protected] :)
and the patch I attached is just a quick fix, to show what the fix is. there still some room to improve, like, check whethe resolve path exists in opcache config.m4 side.
for now, @pilif I will wait for you feedback of the result of test :)
thanks
Both of my test-cases that I've attached to an earlier comment still fail with the patch applied :(
Hey, sorry for my previous mistake, I updated the patch, and tested, it should be work now: https://gist.github.com/laruence/6988427
The new patch (the last one in the gist - you've accidentally pasted a whole shell session
20% of our traffic is hitting opcache with your fix applied. No issues. I'd say that this is a valid solution. The question that remains now is whether the patch to phar will get accepted or not.
@pilif phar side has been committed. opcache side will commit after dmitry look into it. thanks
the fixes for php-5.5/ext/opcache and zendtech/opcache are different:
1. for php-5.5/opcache: https://gist.github.com/laruence/6988427
2. for zendtech/opcache: https://gist.github.com/laruence/7080980
the fix has been committed, thanks for spotting this.
anyway, our fix only works for later releases of PHP-5.5, since the key struct info of phar wasn't exposed before.
thanks
One question remains: How are changes to ZendOptimizerPlus merged back into opcache in PHP? Do I have to report the issue there too, or is this being dealt with?
never mind - I just noticed you commited the fix to php-src too. Thanks a lot :-)
@laruence, sorry but I've only got back to my home base a few days ago and this has been my first solid day for programming catchup.
This patch misses one usecase and that is when an alias is used as a absolute reference, for example require "phar://this/test.php";. This creates a ZCG(hash) entry using the key phar://this/test.php indirecting (in my previous test case) to the ZCG(hash) entry for "phar:///tmp/newphar.phar.php/test.php".
However, the Phar alias this is only valid for the scope of the request and might well point to a different phar on a subsequent requests.
OK, this isn't strictly a path issue as this is an absolute alias, so this issue can remain closed. Nonetheless it is an outstanding bug, so I'll code up the fix and post it back. If you want to leave this issue closed, I'll open another to cover it.
@laruence, please see https://gist.github.com/laruence/6988427/#comment-938419 - without this
<?php
# phar.readonly must be 0
$stub = '<?php
Phar::interceptFileFuncs();
require "phar://this/index.php";
__HALT_COMPILER(); ?>';
$p = new Phar( dirname(__FILE__) . '/x.phar.php', 0, 'this');
$p['index.php'] = '<?php
echo "Hello from Index.\n";
require_once "phar://this/hello.php";
';
$p['hello.php'] = "Hello World!\n";
$p->setStub($stub);
unset($p);generates ZCG(hash) entries for:
- /tmp/x.phar.php
- phar:///tmp/x.phar.php/index.php
- phar://this/index.php indirects to (2)
- phar:///tmp/x.phar.php/hello.php
- phar://this/hello.php indirects to (4)
and with the change:
- /tmp/x.phar.php
- phar:///tmp/x.phar.php/index.php
- phar:///tmp/x.phar.php/hello.php
which is what you want -- phar aliases are always resolved during accel_make_persistent_key.
Hi Terry,
I don't know all this PHAR magic well :(
As I understood, it creates a "this" alias, that is valid only in context of single PHAR file. right?
In case you see a problem, pleas provide a short script that demonstrates wrong OPcache behavior (PHPT script).
Dmitry, the Phar aliases are stored in PHAR_G(phar_alias_map) which is reinitialised on every request. Hence we can have the situation where:
- On request 1
this->/pathA/pharA.phar.phpand henceindex.php->phar:///pathA/pharA.phar.php/index.php - On request 2
this->/pathB/pharB.phar.phpand henceindex.php->phar:///pathB/pharB.phar.php/index.php
However because the ZCG(hash) exists for the image activation, current opcache will resolve all compile requests for index.php to the first of these, phar:///pathA/pharA.phar.php/index.php. For example
compile the following with php -d phar.readonly=0
<?php
# phar.readonly must be 0
$stub = '<?php
Phar::interceptFileFuncs();
require "phar://this/index.php";
__HALT_COMPILER(); ?>';
$dir = dirname(__FILE__);
foreach (['A', 'B'] as $v) {
$p = new Phar( "$dir/$v.phar.php", 0, 'this');
$p['index.php'] = "<?php
echo \"Entering Index $v.\n\";
include 'phar://this/hello.php';
";
$p['hello.php'] = "Hello World says I am $v!\n";
$p->setStub($stub);
unset($p);
}Running A.phar.php and B.phar.php from the CLI will work as expected, but if you move these files into a folder that is web accessible which a persistent opcache then requesting the A then the B variant shows the bug.
I have a CLI test script which shows this bug on my opcache, but that's because I can use the same cache for both variants. Unfortunately, AFAIK you can't use either of the run-tests.php or the server-tests.php variants to execute multiple scripts within the same using the same cache, and hence you can't cover the "already primed" cache execution paths in the standard test scripts. Or have I misunderstood something here?
Dmitry,
Unfortunately, AFAIK you can't use either of the
run-tests.phpor theserver-tests.phpvariants to execute multiple scripts within the same using the same cache, and hence you can't cover the "already primed" cache execution paths in the standard test scripts.
This is a potential issue for any package which carries state from one request to another, such as is the case for opcache. The second request can cover code paths which are simply not exposed in the first request processed. I've cobbled together a test stub which spawns a php -S 8888 process and then hits it with http://localhost:8888 requests to do this. Maybe it might be worth doing this properly. Need to think about this some more. If I recall correctly there are other tests which use this technique.
Do you think this is worth doing?
I am not sure how to fix this...
I am not sure what the specific "this" is here. If you mean the util.c:phar_resolve_alias() bug its just an issue of replacing the return 1 and return 0 to statements to return SUCCESS / FAILURE as I've marked up on your Issue115 patch. I can provide a patch of the patch if you want..
I meant the "index.php" resolving issue.
and for the "issue" you marked up: 785cd59#commitcomment-4494608
I am not sure, why are you thinking that it should return success instead of 1? as I replied, we use 0,1 in opcache as bool value.
did I misunderstand your words?
My bad. I made the markup to the wrong routine which caused this confusion. I've backed this comment out.
I was also picking up the wrong version of util.c:phar_resolve_alias() for my tests and had missed your patch Retruning SUCCESS/FAILURE instead of int, so I was uncovering a bug that you had already fixed. Again sorry for the confusion.
the previous fix made opcache depends on phar... I revert it, will find another way to fix it, maybe fix it in phar side
I know that Phar is the only stream extension with URI mapping that OPcache currently supports, but in principle there could be others in future, so fixing this is OPcache should be as Phar-agnostic as practical.
Zend already includes ~40 hooks to address such load and runtime binding issues, but none of these can be overloaded to do what is needed here. Adding another hook, say zend_resolve_filename, would do the job, but this requires an extra declaration in zend.h. Does this mean that such a solution would need to be deferred to deferred to 5.6 rather than a 5.5.x release? I would appreciate your / Dmitri's guidance on this.
The change is quite straight forward, but we also need to ensure it handles the other URI mapping case, that is the use of Phar mount points. If you can wait a week, I can propose a patch for you based on an additional hook in zend.h. I don't see this as an urgent bug, given that as Dmitri pointed out, it can be coded around in the application on a case-by-case basis.
It seems there is problem with loading Phar files when using zend opcache, i trace it down to path issues.
for example when you use the following script in Mutliple Hosts Environment (MHE), it does not work and the autoload does not find and scripts, and my guess is that the opcache is loading the other websites phar and they have different owner & permmsions so its fails to load them.
but changing the baseDir to
it seems to work alright, and Yes i tried the first script with opcache disabled and it worked
the same problem happens with APC as well.