Equivalent C++ to Python generator pattern

Multi tool use
Multi tool use
The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP


Equivalent C++ to Python generator pattern



I've got some example Python code that I need to mimic in C++. I do not require any specific solution (such as co-routine based yield solutions, although they would be acceptable answers as well), I simply need to reproduce the semantics in some manner.



This is a basic sequence generator, clearly too large to store a materialized version.


def pair_sequence():
for i in range(2**32):
for j in range(2**32):
yield (i, j)



The goal is to maintain two instances of the sequence above, and iterate over them in semi-lockstep, but in chunks. In the example below the first_pass uses the sequence of pairs to initialize the buffer, and the second_pass regenerates the same exact sequence and processes the buffer again.


first_pass


second_pass


def run():
seq1 = pair_sequence()
seq2 = pair_sequence()

buffer = [0] * 1000
first_pass(seq1, buffer)
second_pass(seq2, buffer)
... repeat ...



The only thing I can find for a solution in C++ is to mimic yield with C++ coroutines, but I haven't found any good reference on how to do this. I'm also interested in alternative (non general) solutions for this problem. I do not have enough memory budget to keep a copy of the sequence between passes.


yield





As you can see from here stackoverflow.com/questions/3864410/… coroutine is not good idea to implement. Can't you do it with buffered reading? stackoverflow.com/questions/4685862/…
– batbaatar
Jan 30 '12 at 4:11





C++ iterators should support something like this.
– Lalaland
Jan 30 '12 at 5:28




10 Answers
10



Generators exist in C++, just under another name: Input Iterators. For example, reading from std::cin is similar to having a generator of char.


std::cin


char



You simply need to understand what a generator does:



In your trivial example, it's easy enough. Conceptually:


struct State { unsigned i, j; };

State make();

void next(State&);

bool isDone(State const&);



Of course, we wrap this as a proper class:


class PairSequence:
// (implicit aliases)
public std::iterator<
std::input_iterator_tag,
std::pair<unsigned, unsigned>
>
{
// C++03
typedef void (PairSequence::*BoolLike)();
void non_comparable();
public:
// C++11 (explicit aliases)
using iterator_category = std::input_iterator_tag;
using value_type = std::pair<unsigned, unsigned>;
using reference = value_type const&;
using pointer = value_type const*;
using difference_type = ptrdiff_t;

// C++03 (explicit aliases)
typedef std::input_iterator_tag iterator_category;
typedef std::pair<unsigned, unsigned> value_type;
typedef value_type const& reference;
typedef value_type const* pointer;
typedef ptrdiff_t difference_type;

PairSequence(): done(false) {}

// C++11
explicit operator bool() const { return !done; }

// C++03
// Safe Bool idiom
operator BoolLike() const {
return done ? 0 : &PairSequence::non_comparable;
}

reference operator*() const { return ij; }
pointer operator->() const { return &ij; }

PairSequence& operator++() {
static unsigned const Max = std::numeric_limts<unsigned>::max();

assert(!done);

if (ij.second != Max) { ++ij.second; return *this; }
if (ij.first != Max) { ij.second = 0; ++ij.first; return *this; }

done = true;
return *this;
}

PairSequence operator++(int) {
PairSequence const tmp(*this);
++*this;
return tmp;
}

private:
bool done;
value_type ij;
};



So hum yeah... might be that C++ is a tad more verbose :)





I accepted your answer (thanks!) because it is technically correct for the question i gave. Do you have any pointers for techniques in cases where the sequence that needs to be generated is more complex, or am I just beating a dead horse here with C++ and really coroutines are the only way for generality?
– Noah Watkins
Jan 30 '12 at 15:31





@NoahWatkins: coroutines make for an easy implementation when languages support them. Unfortunately C++ does not, so iteration is easier. If you really need coroutines, you actually need a full-blown thread to hold the "stack" of your function call on the side. It's definitely overkill to open such a can of worms just for that in this example, but your mileage may vary depending on your actual needs.
– Matthieu M.
Jan 30 '12 at 15:46







A non-thread-based coroutine implementation is available in Boost boost.org/doc/libs/1_57_0/libs/coroutine/doc/html/index.html with a proposal for standardization here: open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3985.pdf
– boycy
Jan 13 '15 at 17:00







@boycy: There are actually multiple proposal for coroutines, notably one stack-less and the other stack-full. It's tough nut to crack, so for now I am waiting. In the meantime though, stack-less coroutines are implementable near directly as Input Iterators (just, without the sugar).
– Matthieu M.
Jan 13 '15 at 19:11





Yet similar, iterators are not the same as generators.
– Огњен Шобајић
Feb 8 '16 at 4:59



In C++ there are iterators, but implementing an iterator isn't straightforward: one has to consult the iterator concepts and carefully design the new iterator class to implement them. Thankfully, Boost has an iterator_facade template which should help implementing the iterators and iterator-compatible generators.



Sometimes a stackless coroutine can be used to implement an iterator.



P.S. See also this article which mentions both a switch hack by Christopher M. Kohlhoff and Boost.Coroutine by Oliver Kowalke. Oliver Kowalke's work is a followup on Boost.Coroutine by Giovanni P. Deretta.


switch



P.S. I think you can also write a kind of generator with lambdas:


std::function<int()> generator = {
int i = 0;
return [=]() mutable {
return i < 10 ? i++ : -1;
};
}();
int ret = 0; while ((ret = generator()) != -1) std::cout << "generator: " << ret << std::endl;



Or with a functor:


struct generator_t {
int i = 0;
int operator() () {
return i < 10 ? i++ : -1;
}
} generator;
int ret = 0; while ((ret = generator()) != -1) std::cout << "generator: " << ret << std::endl;



P.S. Here's a generator implemented with the Mordor coroutines:


#include <iostream>
using std::cout; using std::endl;
#include <mordor/coroutine.h>
using Mordor::Coroutine; using Mordor::Fiber;

void testMordor() {
Coroutine<int> coro ((Coroutine<int>& self) {
int i = 0; while (i < 9) self.yield (i++);
});
for (int i = coro.call(); coro.state() != Fiber::TERM; i = coro.call()) cout << i << endl;
}



Since Boost.Coroutine2 now supports it very well (I found it because I wanted to solve exactly the same yield problem), I am posting the C++ code that matches your original intention:


yield


#include <stdint.h>
#include <iostream>
#include <memory>
#include <boost/coroutine2/all.hpp>

typedef boost::coroutines2::coroutine<std::pair<uint16_t, uint16_t>> coro_t;

void pair_sequence(coro_t::push_type& yield)
{
uint16_t i = 0;
uint16_t j = 0;
for (;;) {
for (;;) {
yield(std::make_pair(i, j));
if (++j == 0)
break;
}
if (++i == 0)
break;
}
}

int main()
{
coro_t::pull_type seq(boost::coroutines2::fixedsize_stack(),
pair_sequence);
for (auto pair : seq) {
print_pair(pair);
}
//while (seq) {
// print_pair(seq.get());
// seq();
//}
}



In this example, pair_sequence does not take additional arguments. If it needs to, std::bind or a lambda should be used to generate a function object that takes only one argument (of push_type), when it is passed to the coro_t::pull_type constructor.


pair_sequence


std::bind


push_type


coro_t::pull_type





Note that Coroutine2 requires c++11, for which visual studio 2013 is insufficient as it is only partially supported.
– Weston
Jan 10 at 17:49



You should probably check generators in std::experimental in Visual Studio 2015 e.g: https://blogs.msdn.microsoft.com/vcblog/2014/11/12/resumable-functions-in-c/



I think it's exactly what you are looking for. Overall generators should be available in C++17 as this is only experimental Microsoft VC feature.



If you only need to do this for a relatively small number of specific generators, you can implement each as a class, where the member data is equivalent to the local variables of the Python generator function. Then you have a next function that returns the next thing the generator would yield, updating the internal state as it does so.



This is basically similar to how Python generators are implemented, I believe. The major difference being they can remember an offset into the bytecode for the generator function as part of the "internal state", which means the generators can be written as loops containing yields. You would have to instead calculate the next value from the previous. In the case of your pair_sequence, that's pretty trivial. It may not be for complex generators.


pair_sequence



You also need some way of indicating termination. If what you're returning is "pointer-like", and NULL should not be a valid yieldable value you could use a NULL pointer as a termination indicator. Otherwise you need an out-of-band signal.



Something like this is very similar:


struct pair_sequence
{
typedef pair<unsigned int, unsigned int> result_type;
static const unsigned int limit = numeric_limits<unsigned int>::max()

pair_sequence() : i(0), j(0) {}

result_type operator()()
{
result_type r(i, j);
if(j < limit) j++;
else if(i < limit)
{
j = 0;
i++;
}
else throw out_of_range("end of iteration");
}

private:
unsigned int i;
unsigned int j;
}



Using the operator() is only a question of what you want to do with this generator, you could also build it as a stream and make sure it adapts to an istream_iterator, for example.



All answers that involve writing your own iterator are completely wrong. Such answers entirely miss the point of Python generators (one of the language's greatest and unique features). The most important thing about generators is that execution picks up where it left off. This does not happen to iterators. Instead, you must manually store state information such that when operator++ or operator* is called anew, the right information is in place at the very beginning of the next function call. This is why writing your own C++ iterator is a gigantic pain; whereas, generators are elegant, and easy to read+write.



I don't think there is a good analog for Python generators in native C++, at least not yet (there is a rummor that yield will land in C++17). You can get something similarish by resorting to third-party (e.g. Yongwei's Boost suggestion), or rolling your own.



I would say the closest thing in native C++ is threads. A thread can maintain a suspended set of local variables, and can continue execution where it left off, very much like generators, but you need to roll a little bit of additional infrastructure to support communication between the generator object and its caller. E.g.


// Infrastructure

template <typename Element>
class Channel { ... };

// Application

using IntPair = std::pair<int, int>;

void yield_pairs(int end_i, int end_j, Channel<IntPair>* out) {
for (int i = 0; i < end_i; ++i) {
for (int j = 0; j < end_j; ++j) {
out->send(IntPair{i, j}); // "yield"
}
}
out->close();
}

void MyApp() {
Channel<IntPair> pairs;
std::thread generator(yield_pairs, 32, 32, &pairs);
for (IntPair pair : pairs) {
UsePair(pair);
}
generator.join();
}



This solution has several downsides though:



Just as a function simulates the concept of a stack, generators simulate the concept of a queue. The rest is semantics.



As a side note, you can always simulate a queue with a stack by using a stack of operations instead of data. What that practically means is that you can implement a queue-like behavior by returning a pair, the second value of which either has the next function to be called or indicates that we are out of values. But this is more general than what yield vs return does. It allows to simulate a queue of any values rather than homogeneous values that you expect from a generator, but without keeping a full internal queue.



More specifically, since C++ does not have a natural abstraction for a queue, you need to use constructs which implement a queue internally. So the answer which gave the example with iterators is a decent implementation of the concept.



What this practically means is that you can implement something with bare-bones queue functionality if you just want something quick and then consume queue's values just as you would consume values yielded from a generator.



Something like this:



Example use:


using ull = unsigned long long;

auto main() -> int {
for (ull val : range_t<ull>(100)) {
std::cout << val << std::endl;
}

return 0;
}



Will print the numbers from 0 to 99



Using range-v3:


#include <iostream>
#include <tuple>
#include <range/v3/all.hpp>

using namespace std;
using namespace ranges;

auto generator = [x = view::iota(0) | view::take(3)] {
return view::cartesian_product(x, x);
};

int main () {
for (auto x : generator()) {
cout << get<0>(x) << ", " << get<1>(x) << endl;
}

return 0;
}






By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.

j9Pzwq,YWoJeo9CuMHj0hgVpRkcJiP,eQ ZHbGpuT43zn5K,uvt2lFH5nzS k,rPfLOYkd,ZGPD,9gd
hp,qEAdg,M 60A,0ADgR8G3,FB GXYk3HP 78A9Y,Gx,Tmpc

Popular posts from this blog

Visual Studio Code: How to configure includePath for better IntelliSense results

Spring cloud config client Could not locate PropertySource

Current 93