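The first example below, reading a file line by line, is implemented in three ways: a plain function, an iterator, and a generator. It compares their advantages and disadvantages, is well done, and can be reused in your own code. Supported PHP versions: PHP 5 >= 5.5.0.
The explanation of yield that follows needs to be translated and understood line by line.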
Request for Comments: Generators

Generators provide an easy, boilerplate-free way of implementing iterators.
As an example, consider how you would implement the file() function in userland code:
function getLinesFromFile($fileName) {
    if (!$fileHandle = fopen($fileName, 'r')) {
        return;
    }

    $lines = [];
    while (false !== $line = fgets($fileHandle)) {
        $lines[] = $line;
    }

    fclose($fileHandle);

    return $lines;
}

$lines = getLinesFromFile($fileName);
foreach ($lines as $line) {
    // do something with $line
}
The main disadvantage of this kind of code is evident: It will read the whole file into a large array. Depending on how big the file is, this can easily hit the memory limit. This is not what you usually want. Instead you want to get the lines one by one. This is what iterators are perfect for.
Sadly implementing iterators requires an insane amount of boilerplate code. E.g. consider this iterator variant of the above function:
class LineIterator implements Iterator {
    protected $fileHandle;

    protected $line;
    protected $i;

    public function __construct($fileName) {
        if (!$this->fileHandle = fopen($fileName, 'r')) {
            throw new RuntimeException('Couldn\'t open file "' . $fileName . '"');
        }
    }

    public function rewind() {
        fseek($this->fileHandle, 0);
        $this->line = fgets($this->fileHandle);
        $this->i = 0;
    }

    public function valid() {
        return false !== $this->line;
    }

    public function current() {
        return $this->line;
    }

    public function key() {
        return $this->i;
    }

    public function next() {
        if (false !== $this->line) {
            $this->line = fgets($this->fileHandle);
            $this->i++;
        }
    }

    public function __destruct() {
        fclose($this->fileHandle);
    }
}

$lines = new LineIterator($fileName);
foreach ($lines as $line) {
    // do something with $line
}
As you can see a very simple piece of code can easily become very complicated when turned into an iterator. Generators solve this problem and allow you to implement iterators in a very straightforward manner:
function getLinesFromFile($fileName) {
    if (!$fileHandle = fopen($fileName, 'r')) {
        return;
    }

    while (false !== $line = fgets($fileHandle)) {
        yield $line;
    }

    fclose($fileHandle);
}

$lines = getLinesFromFile($fileName);
foreach ($lines as $line) {
    // do something with $line
}
The code looks very similar to the array-based implementation. The main difference is that instead of pushing values into an array the values are yielded.
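To make the "one by one" behavior concrete, here is a minimal sketch that runs the generator variant against a temporary file and stops after three lines. The helper makeSampleFile() and the sample contents are assumptions made for illustration; they are not part of the RFC.

```php
<?php
// Hypothetical helper: writes $count short lines to a temporary file.
function makeSampleFile($count) {
    $path = tempnam(sys_get_temp_dir(), 'gen');
    $handle = fopen($path, 'w');
    for ($i = 1; $i <= $count; $i++) {
        fwrite($handle, "line $i\n");
    }
    fclose($handle);
    return $path;
}

// Generator variant from the text above.
function getLinesFromFile($fileName) {
    if (!$fileHandle = fopen($fileName, 'r')) {
        return;
    }
    while (false !== $line = fgets($fileHandle)) {
        yield $line;
    }
    fclose($fileHandle);
}

$path = makeSampleFile(1000);

// Only the requested lines are pulled from the generator;
// no array holding all 1000 lines is ever built.
$firstThree = [];
foreach (getLinesFromFile($path) as $line) {
    $firstThree[] = rtrim($line, "\n");
    if (count($firstThree) === 3) {
        break;
    }
}
unlink($path);

print_r($firstThree);
```

Because the loop breaks early, the fclose() call inside the generator is never reached; the handle is released when the generator object itself is destroyed.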
Generators work by passing control back and forth between the generator and the calling code:
When you first call the generator function ($lines = getLinesFromFile($fileName)) the passed argument is bound, but nothing of the code is actually executed. Instead the function directly returns a Generator object. That Generator object implements the Iterator interface and is what is eventually traversed by the foreach loop:

Whenever the Iterator::next() method is called PHP resumes the execution of the generator function until it hits a yield expression. The value of that yield expression is what Iterator::current() then returns.
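The back-and-forth can be observed by driving a generator by hand instead of through foreach. The following is only a sketch: the function numbers() and the $trace log object are hypothetical names introduced here to record the order of events.

```php
<?php
// $trace is a hypothetical log used to record who runs when.
function numbers(ArrayObject $trace) {
    $trace[] = 'generator: started';
    yield 1;
    $trace[] = 'generator: resumed';
    yield 2;
    $trace[] = 'generator: finished';
}

$trace = new ArrayObject();
$gen = numbers($trace);            // argument is bound, body has not run
$trace[] = 'caller: generator created';

$value = $gen->current();          // runs the body up to "yield 1"
$trace[] = "caller: got $value";

$gen->next();                      // resumes the body up to "yield 2"
$value = $gen->current();
$trace[] = "caller: got $value";

$gen->next();                      // resumes the body until it ends
$trace[] = 'caller: valid is ' . var_export($gen->valid(), true);

echo implode("\n", $trace->getArrayCopy()), "\n";
```

The printed log alternates between caller and generator entries, and "caller: generator created" appears before "generator: started", which matches the description above.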
Generator methods, together with the IteratorAggregate interface, can be used to easily implement traversable classes too:
class Test implements IteratorAggregate {
    protected $data;

    public function __construct(array $data) {
        $this->data = $data;
    }

    public function getIterator() {
        foreach ($this->data as $key => $value) {
            yield $key => $value;
        }
        // or whatever other traversal logic the class has
    }
}

$test = new Test(['foo' => 'bar', 'bar' => 'foo']);
foreach ($test as $k => $v) {
    echo $k, ' => ', $v, "\n";
}
Generators can also be used the other way around, i.e. instead of producing values they can also consume them. When used in this way they are often referred to as enhanced generators, reverse generators or coroutines.
Coroutines are a rather advanced concept, so it is very hard to come up with short examples that are not too contrived. For an introduction see an example on how to parse streaming XML using coroutines. If you want to know more, I highly recommend checking out a presentation on this subject.
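As a deliberately small (and admittedly contrived) sketch of a generator that consumes values, the function below receives strings through send() and stores them in a sink. The names collector() and $sink are assumptions made for this illustration and do not come from the RFC.

```php
<?php
// Hypothetical consumer: every value sent in is upper-cased and stored.
function collector(ArrayObject $sink) {
    while (true) {
        // The generator pauses here until the caller sends a value.
        // Parentheses keep the expression valid on PHP 5.5 as well.
        $value = (yield);
        $sink[] = strtoupper($value);
    }
}

$sink = new ArrayObject();
$consumer = collector($sink);

$consumer->send('foo');
$consumer->send('bar');

print_r($sink->getArrayCopy());
```

Here the data flows from the caller into the generator, the reverse of the file-reading example.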
Specification

Recognition of generator functions

Any function which contains a yield statement is automatically a generator function.
The initial implementation required that generator functions are marked with an asterisk modifier (function*). This method has the advantage that generators are more explicit and also allows for yield-less coroutines.
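The automatic detection was chosen over the asterisk modifier for several reasons. Among them: with the modifier, a generator that yields by reference would have to be declared as function *&gen(), and yield-less coroutines remain possible under automatic detection by using code like if (false) yield;.

When a generator function is called the execution is suspended immediately after parameter binding and a Generator object is returned.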
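Automatic detection can be checked directly: the mere presence of a yield, even an unreachable one, is enough for a call to return a Generator object. This is a minimal sketch; emptyCoroutine() and ordinary() are hypothetical names chosen for the illustration.

```php
<?php
// The yield is never reached, but its presence alone
// makes this a generator function.
function emptyCoroutine() {
    if (false) yield;
}

// No yield anywhere, so this is an ordinary function.
function ordinary() {
    return 'plain value';
}

$gen = emptyCoroutine();

var_dump($gen instanceof Generator);        // a Generator object was returned
var_dump($gen->valid());                    // body ran to the end without yielding
var_dump(ordinary() instanceof Generator);  // the plain function returns its value
```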
The Generator object implements the following interface:
final class Generator implements Iterator {
    void  rewind();
    bool  valid();
    mixed current();
    mixed key();
    void  next();

    mixed send(mixed $value);
    mixed throw(Exception $exception);
}
If the generator is not yet at a yield statement (i.e. was just created and not yet used as an iterator), then any call to rewind, valid, current, key, next or send will resume the generator until the next yield statement is hit.
Consider this example:
function gen() {
    echo 'start';
    yield 'middle';
    echo 'end';
}

// Initial call does not output anything
$gen = gen();

// Call to current() resumes the generator, thus "start" is echo'd.
// Then the yield expression is hit and the string "middle" is returned
// as the result of current() and then echo'd.
echo $gen->current();

// Execution of the generator is resumed again, thus echoing "end"
$gen->next();
A nice side-effect of this behavior is that coroutines do not have to be primed with a next() call before they can be used. (This is required in Python and also the reason why coroutines in Python usually use some kind of decorator that automatically primes the coroutine.)
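A short sketch of what "no priming" means in practice: the very first send() on a fresh generator first runs it up to its first yield and only then delivers the value, so the first value is not lost. The function runningTotal() is a hypothetical example, not taken from the RFC.

```php
<?php
// Hypothetical coroutine: adds up every amount it is sent and
// hands the current total back to the caller at each yield.
function runningTotal() {
    $total = 0;
    while (true) {
        // Yield the total so far, then wait for the next amount.
        $amount = (yield $total);
        $total += $amount;
    }
}

$acc = runningTotal();

// No next() call beforehand: send() advances the fresh generator
// to its first yield by itself, then passes the value in.
$first  = $acc->send(5);
$second = $acc->send(10);

echo $first, "\n";   // total after the first amount
echo $second, "\n";  // total after the second amount
```

Each send() returns the value of the yield at which the generator stops next, which is why the totals come back to the caller.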
Apart from the above the Generator methods behave as follows:
rewind: Throws an exception if the generator is currently after the first yield. (More in the “Rewinding a generator” section.)
valid: Returns false if the generator h