|  | Home | Libraries | People | FAQ | More | 
        Backtracking in Spirit.Qi requires the use of the following
        types of iterator: forward, bidirectional, or random access. Because of backtracking,
        input iterators cannot be used. Therefore, the standard library classes
        std::istreambuf_iterator and std::istream_iterator,
        that fall under the category of input iterators, cannot be used. Another
        input iterator that is of interest is one that wraps a lexer, such as LEX.
      
| ![[Note]](../../images/note.png) | Note | 
|---|---|
| 
          In general, Spirit.Qi generates recursive descent
          parser which require backtracking parsers by design. For this reason we
          need to provide at least forward iterators to any of Spirit.Qi's
          API functions. This is not an absolute requirement though. In the future,
          we shall see more deterministic parsers that require no more than 1 character
          (token) of lookahead. Such parsers allow us to use input iterators such
          as the  | 
        Backtracking can be implemented only if we are allowed to save an iterator
        position, i.e. making a copy of the current iterator. Unfortunately, with
        an input iterator, there is no way to do so, and thus input iterators will
        not work with backtracking in Spirit.Qi. One solution
        to this problem is to simply load all the data to be parsed into a container,
        such as a vector or deque, and then pass the begin and end of the container
        to Spirit.Qi. This method can be too memory intensive
        for certain applications, which is why the multi_pass
        iterator was created.
      
        The multi_pass iterator will
        convert any input iterator into a forward iterator suitable for use with
        Spirit.Qi. multi_pass
        will buffer data when needed and will discard the buffer when its contents
        is not needed anymore. This happens either if only one copy of the iterator
        exists or if no backtracking can occur.
      
        A grammar must be designed with care if the multi_pass
        iterator is used. Any rule that may need to backtrack, such as one that contains
        an alternative, will cause data to be buffered. The rules that are optimal
        to use are repetition constructs (as kleene and plus).
      
        Sequences of the form a >> b
        will buffer data as well. This is different from the behaviour of Spirit.Classic
        but for a good reason. Sequences need to reset the current iterator to its
        initial state if one of the components of a seqeunce fails to match. To compensate
        for this behaviour we added functionality to the expect
        parsers (i.e. constructs like a
        > b).
        Expectation points introduce deterministic points into the grammar ensuring
        no backtracking can occur if they match. For this reason we clear the buffers
        of any multi_pass iterator on each expectation point, ensuring minimal buffer
        content even for large grammars.
      
| ![[Important]](../../images/important.png) | Important | 
|---|---|
| 
          If you use an error handler in conjunction with the  rule r<iterator_type> r; on_error<retry>(r, std::cout << phoenix::val("Error!")); If you fail to do so the resulting code will trigger an assert statement at runtime. | 
        Any rule that repeats, such as kleene_star (*a) or positive such as (+a), will only buffer the data for the current
        repetition.
      
        In typical grammars, ambiguity and therefore lookahead is often localized.
        In fact, many well designed languages are fully deterministic and require
        no lookahead at all. Peeking at the first character from the input will immediately
        determine the alternative branch to take. Yet, even with highly ambiguous
        grammars, alternatives are often of the form *(a | b
        | c | d).
        The input iterator moves on and is never stuck at the beginning. Let's look
        at a Pascal snippet for example:
      
program = programHeading >> block >> '.' ; block = *( labelDeclarationPart | constantDefinitionPart | typeDefinitionPart | variableDeclarationPart | procedureAndFunctionDeclarationPart ) >> statementPart ;
Notice the alternatives inside the Kleene star in the rule block . The rule gobbles the input in a linear manner and throws away the past history with each iteration. As this is fully deterministic LL(1) grammar, each failed alternative only has to peek 1 character (token). The alternative that consumes more than 1 character (token) is definitely a winner. After which, the Kleene star moves on to the next.
        Now, after the lecture on the features to be careful with when using multi_pass, you may think that multi_pass is way too restrictive to use.
        That's not the case. If your grammar is deterministic, you can make use of
        the flush_multi_pass pseudo
        parser in your grammar to ensure that data is not buffered when unnecessary
        (flush_multi_pass is available
        from the Spirit.Qi parser Repository).
      
        Here we present a minimal example showing a minimal use case. The multi_pass iterator is highly configurable,
        but the default policies have been chosen so that its easily usable with
        input iterators such as std::istreambuf_iterator.
        For the complete source code of this example please refer to multi_pass.cpp.
      
int main() { namespace spirit = boost::spirit; using spirit::ascii::space; using spirit::ascii::char_; using spirit::qi::double_; using spirit::qi::eol; std::ifstream in("multi_pass.txt"); // we get our input from this file if (!in.is_open()) { std::cout << "Could not open input file: 'multi_pass.txt'" << std::endl; return -1; } typedef std::istreambuf_iterator<char> base_iterator_type; spirit::multi_pass<base_iterator_type> first = spirit::make_default_multi_pass(base_iterator_type(in)); std::vector<double> v; bool result = spirit::qi::phrase_parse(first , spirit::make_default_multi_pass(base_iterator_type()) , double_ >> *(',' >> double_) // recognize list of doubles , space | '#' >> *(char_ - eol) >> eol // comment skipper , v); // data read from file if (!result) { std::cout << "Failed parsing input file!" << std::endl; return -2; } std::cout << "Successfully parsed input file!" << std::endl; return 0; }
        The Spirit Repository
        contains the flush_multi_pass
        parser component. This is usable in conjunction with the multi_pass
        iterator to minimize the buffering. It allows to insert explicit synchronization
        points into your grammar where it is safe to clear any stored input as it
        is ensured that no backtracking can occur at this point anymore.
      
        When the flush_multi_pass
        parser is used with multi_pass,
        it will call multi_pass::clear_queue().
        This will cause any buffered data to be erased. This also will invalidate
        all other copies of multi_pass and they should not be used. If they are,
        an boost::illegal_backtracking exception will be
        thrown.
      
        The multi_pass iterator is
        a templated class configurable using policies. The description of multi_pass above is how it was originally
        implemented (before it used policies), and is the default configuration now.
        But, multi_pass is capable
        of much more. Because of the open-ended nature of policies, you can write
        your own policy to make multi_pass
        behave in a way that we never before imagined.
      
The multi_pass class has two template parameters:
The multi_pass template parameters
The type multi_pass uses to acquire it's input. This is typically an input iterator, or functor.
The combined policies to use to create an instance of a multi_pass iterator. This combined policy type is described below
        It is possible to implement all of the required functionality of the combinded
        policy in a single class. But it has shown to be more convenient to split
        this into four different groups of functions, i.e. four separate, but well
        coordinated policies. For this reason the multi_pass
        library implements a template iterator_policies::default_policy
        allowing to combine several different policies, each implementing one of
        the functionality groups:
      
Table 12. Policies needed for default_policy template
| Template Parameter | Description | 
|---|---|
| 
                   | 
                  This policy determines how  | 
| 
                   | This policy determines how checking for invalid iterators is done. | 
| 
                   | 
                  A class that defines how  | 
| 
                   | 
                  The buffering scheme used by  | 
        The multi_pass library contains
        several predefined policy implementations for each of the policy types as
        described above. First we will describe those predefined types. Afterwards
        we will give some guidelines how you can write your own policy implementations.
      
        All predefined multi_pass
        policies are defined in the namespace boost::spirit::iterator_policies.
      
Table 13. Predefined policy classes
| Class name | Description | 
|---|---|
| InputPolicy classes | |
| 
                   | 
                  This policy directs  | 
| 
                   | 
                  This policy directs  | 
| 
                   | 
                  This policy directs  | 
| 
                   | This policy obtains it's input by calling yylex(), which would typically be provided by a scanner generated by Flex. If you use this policy your code must link against a Flex generated scanner. | 
| 
                   | 
                  This input policy obtains it's data by calling a functor of type
                   | 
| 
                   | 
                  This is essentially the same as the  | 
| OwnershipPolicy classes | |
| 
                   | 
                  This class uses a reference counting scheme. The  | 
| 
                   | 
                  When this policy is used, the first  | 
| CheckingPolicy classes | |
| 
                   | This policy does no checking at all. | 
| 
                   | 
                  This policy keeps around a buffer id, or a buffer age. Every time
                   | 
| full_check | This policy has not been implemented yet. When it is, it will keep track of all iterators and make sure that they are all valid. This will be mostly useful for debugging purposes as it will incur significant overhead. | 
| StoragePolicy classes | |
| 
                   | 
                  Despite its name this policy keeps all buffered data in a  | 
| 
                   | 
                  This policy keeps a circular buffer that is size  | 
        The beauty of policy based designs is that you can mix and match policies
        to create your own custom iterator by selecting the policies you want. Here's
        an example of how to specify a custom multi_pass
        that wraps an std::istream_iterator<char>,
        and is slightly more efficient than the default multi_pass
        (as generated by the make_default_multi_pass() API function) because it uses the iterator_policies::first_owner OwnershipPolicy and the iterator_policies::no_check CheckingPolicy:
      
typedef multi_pass< std::istream_iterator<char> , iterator_policies::default_policy< iterator_policies::first_owner , iterator_policies::no_check , iterator_policies::buffering_input_iterator , iterator_policies::split_std_deque > > first_owner_multi_pass_type;
        The default template parameters for iterator_policies::default_policy
        are:
      
iterator_policies::ref_counted OwnershipPolicy
        iterator_policies::no_check CheckingPolicy, if BOOST_SPIRIT_DEBUG is defined: iterator_policies::buf_id_check CheckingPolicy
        iterator_policies::buffering_input_iterator InputPolicy,
          and
        iterator_policies::split_std_deque StoragePolicy.
        
        So if you use multi_pass<std::istream_iterator<char>
        > you will get those pre-defined
        behaviors while wrapping an std::istream_iterator<char>.
      
        There is one other pre-defined class called look_ahead.
        The class look_ahead is another
        predefine multi_pass iterator
        type. It has two template parameters: Input,
        the type of the input iterator to wrap, and a std::size_t N, which specifies the size of the buffer
        to the fixed_size_queue policy.
        While the default multi_pass configuration is designed for safey, look_ahead is designed for speed. look_ahead is derived from a multi_pass
        with the following policies: input_iterator
        InputPolicy, first_owner
        OwnershipPolicy, no_check
        CheckingPolicy, and fixed_size_queue<N>
        StoragePolicy.
      
This iterator is defined by including the files:
// forwards to <boost/spirit/home/support/look_ahead.hpp> #include <boost/spirit/include/support_look_ahead.hpp>
Also, see Include Structure.
        Yet another predefined iterator for wrapping standard input streams (usually
        a std::basic_istream<>)
        is called basic_istream_iterator<Char, Traits>. This class is usable as a drop in replacement
        for std::istream_iterator<Char, Traits>.
        Its only difference is that it is a forward iterator (instead of the std::istream_iterator,
        which is an input iterator). basic_istream_iterator
        is derived from a multi_pass with the following policies: istream
        InputPolicy, ref_counted
        OwnershipPolicy, no_check
        CheckingPolicy, and split_std_deque
        StoragePolicy.
      
There exists an additional predefined typedef:
typedef basic_istream_iterator<char, std::char_traits<char> > istream_iterator;
This iterator is defined by including the files:
// forwards to <boost/spirit/home/support/istream_iterator.hpp> #include <boost/spirit/include/support_istream_iterator.hpp>
Also, see Include Structure.
functor_input
        InputPolicy
      
        If you want to use the functor_input
        InputPolicy, you can write your own function object that will supply the
        input to multi_pass. The
        function object must satisfy several requirements. It must have a typedef
        result_type which specifies
        the return type of its operator(). This is standard practice in the STL.
        Also, it must supply a static variable called eof which is compared against
        to know whether the input has reached the end. Last but not least the function
        object must be default constructible. Here is an example:
      
// define the function object class iterate_a2m { public: typedef char result_type; iterate_a2m() : c('A') {} iterate_a2m(char c) : c_(c) {} result_type operator()() const { if (c_ == 'M') return eof; return c_++; } static result_type eof; private: char c_; }; iterate_a2m::result_type iterate_a2m::eof = iterate_a2m::result_type('\0'); // create two iterators using the define function object, one of which is // an end iterator typedef multi_pass<iterate_a2m , iterator_policies::functor_input , iterator_policies::first_owner , iterator_policies::no_check , iterator_policies::split_std_deque> functor_multi_pass_type; functor_multi_pass_type first = functor_multi_pass_t(iterate_a2m()); functor_multi_pass_type last; // use the iterators: this will print "ABCDEFGHIJKL" while (first != last) { std::cout << *first; ++first; }
        All policies to be used with the default_policy
        template need to have two embedded classes: unique
        and shared. The unique class needs to implement all required
        functions for a particular policy type. In addition it may hold all member
        data items being unique for a particular instance of
        a multi_pass (hence the name).
        The shared class does not
        expose any member functions (except sometimes a constructor), but it may
        hold all member data items to be shared between all
        copies of a particular multi_pass.
      
        An InputPolicy must have
        the following interface:
      
struct input_policy { // Input is the same type used as the first template parameter // while instantiating the multi_pass template <typename Input> struct unique { // these typedef's will be exposed as the multi_pass iterator // properties typedef __unspecified_type__ value_type; typedef __unspecified_type__ difference_type; typedef __unspecified_type__ distance_type; typedef __unspecified_type__ pointer; typedef __unspecified_type__ reference; unique() {} explicit unique(Input) {} // destroy is called whenever the last copy of a multi_pass is // destructed (ownership_policy::release() returned true) // // mp: is a reference to the whole multi_pass instance template <typename MultiPass> static void destroy(MultiPass& mp); // swap is called by multi_pass::swap() void swap(unique&); // get_input is called whenever the next input character/token // should be fetched. // // mp: is a reference to the whole multi_pass instance // // This method is expected to return a refernce to the next // character/token template <typename MultiPass> static typename MultiPass::reference get_input(MultiPass& mp); // advance_input is called whenever the underlying input stream // should be advanced so that the next call to get_input will be // able to return the next input character/token // // mp: is a reference to the whole multi_pass instance template <typename MultiPass> static void advance_input(MultiPass& mp); // input_at_eof is called to test whether this instance is a // end of input iterator. // // mp: is a reference to the whole multi_pass instance // // This method is expected to return true if the end of input is // reached. It is often used in the implementation of the function // storage_policy::is_eof. template <typename MultiPass> static bool input_at_eof(MultiPass const& mp); // input_is_valid is called to verify if the parameter t represents // a valid input character/token // // mp: is a reference to the whole multi_pass instance // t: is the character/token to test for validity // // This method is expected to return true if the parameter t // represents a valid character/token. template <typename MultiPass> static bool input_is_valid(MultiPass const& mp, value_type const& t); }; // Input is the same type used as the first template parameter passed // while instantiating the multi_pass template <typename Input> struct shared { explicit shared(Input) {} }; };
        It is possible to derive the struct unique
        from the type boost::spirit::detail::default_input_policy. This type implements
        a minimal sufficient interface for some of the required functions, simplifying
        the task of writing a new input policy.
      
        This class may implement a function destroy() being called during destruction of the
        last copy of a multi_pass.
        This function should be used to free any of the shared data items the policy
        might have allocated during construction of its shared
        part. Because of the way multi_pass
        is implemented any allocated data members in shared
        should not be deep copied in a copy
        constructor of shared.
      
        The OwnershipPolicy must
        have the following interface:
      
struct ownership_policy { struct unique { // destroy is called whenever the last copy of a multi_pass is // destructed (ownership_policy::release() returned true) // // mp: is a reference to the whole multi_pass instance template <typename MultiPass> static void destroy(MultiPass& mp); // swap is called by multi_pass::swap() void swap(unique&); // clone is called whenever a multi_pass is copied // // mp: is a reference to the whole multi_pass instance template <typename MultiPass> static void clone(MultiPass& mp); // release is called whenever a multi_pass is destroyed // // mp: is a reference to the whole multi_pass instance // // The method is expected to return true if the destructed // instance is the last copy of a particular multi_pass. template <typename MultiPass> static bool release(MultiPass& mp); // is_unique is called to test whether this instance is the only // existing copy of a particular multi_pass // // mp: is a reference to the whole multi_pass instance // // The method is expected to return true if this instance is unique // (no other copies of this multi_pass exist). template <typename MultiPass> static bool is_unique(MultiPass const& mp); }; struct shared {}; };
        It is possible to derive the struct unique
        from the type boost::spirit::detail::default_ownership_policy. This type implements
        a minimal sufficient interface for some of the required functions, simplifying
        the task of writing a new ownership policy.
      
        This class may implement a function destroy() being called during destruction of the
        last copy of a multi_pass.
        This function should be used to free any of the shared data items the policy
        might have allocated during construction of its shared
        part. Because of the way multi_pass
        is implemented any allocated data members in shared
        should not be deep copied in a copy
        constructor of shared.
      
        The CheckingPolicy must have
        the following interface:
      
struct checking_policy { struct unique { // swap is called by multi_pass::swap() void swap(unique&); // destroy is called whenever the last copy of a multi_pass is // destructed (ownership_policy::release() returned true) // // mp: is a reference to the whole multi_pass instance template <typename MultiPass> static void destroy(MultiPass& mp); // docheck is called before the multi_pass is dereferenced or // incremented. // // mp: is a reference to the whole multi_pass instance // // This method is expected to make sure the multi_pass instance is // still valid. If it is invalid an exception should be thrown. template <typename MultiPass> static void docheck(MultiPass const& mp); // clear_queue is called whenever the function // multi_pass::clear_queue is called on this instance // // mp: is a reference to the whole multi_pass instance template <typename MultiPass> static void clear_queue(MultiPass& mp); }; struct shared {}; };
        It is possible to derive the struct unique
        from the type boost::spirit::detail::default_checking_policy. This type implements
        a minimal sufficient interface for some of the required functions, simplifying
        the task of writing a new checking policy.
      
        This class may implement a function destroy() being called during destruction of the
        last copy of a multi_pass.
        This function should be used to free any of the shared data items the policy
        might have allocated during construction of its shared
        part. Because of the way multi_pass
        is implemented any allocated data members in shared
        should not be deep copied in a copy
        constructor of shared.
      
        A StoragePolicy must have
        the following interface:
      
struct storage_policy { // Value is the same type as typename MultiPass::value_type template <typename Value> struct unique { // destroy is called whenever the last copy of a multi_pass is // destructed (ownership_policy::release() returned true) // // mp: is a reference to the whole multi_pass instance template <typename MultiPass> static void destroy(MultiPass& mp); // swap is called by multi_pass::swap() void swap(unique&); // dereference is called whenever multi_pass::operator*() is invoked // // mp: is a reference to the whole multi_pass instance // // This function is expected to return a reference to the current // character/token. template <typename MultiPass> static typename MultiPass::reference dereference(MultiPass const& mp); // increment is called whenever multi_pass::operator++ is invoked // // mp: is a reference to the whole multi_pass instance template <typename MultiPass> static void increment(MultiPass& mp); // // mp: is a reference to the whole multi_pass instance template <typename MultiPass> static void clear_queue(MultiPass& mp); // is_eof is called to test whether this instance is a end of input // iterator. // // mp: is a reference to the whole multi_pass instance // // This method is expected to return true if the end of input is // reached. template <typename MultiPass> static bool is_eof(MultiPass const& mp); // less_than is called whenever multi_pass::operator==() is invoked // // mp: is a reference to the whole multi_pass instance // rhs: is the multi_pass reference this instance is compared // to // // This function is expected to return true if the current instance // is eual to the right hand side multi_pass instance template <typename MultiPass> static bool equal_to(MultiPass const& mp, MultiPass const& rhs); // less_than is called whenever multi_pass::operator<() is invoked // // mp: is a reference to the whole multi_pass instance // rhs: is the multi_pass reference this instance is compared // to // // This function is expected to return true if the current instance // is less than the right hand side multi_pass instance template <typename MultiPass> static bool less_than(MultiPass const& mp, MultiPass const& rhs); }; // Value is the same type as typename MultiPass::value_type template <typename Value> struct shared {}; };
        It is possible to derive the struct unique
        from the type boost::spirit::detail::default_storage_policy. This type implements
        a minimal sufficient interface for some of the required functions, simplifying
        the task of writing a new storage policy.
      
        This class may implement a function destroy() being called during destruction of the
        last copy of a multi_pass.
        This function should be used to free any of the shared data items the policy
        might have allocated during construction of its shared
        part. Because of the way multi_pass
        is implemented any allocated data members in shared
        should not be deep copied in a copy
        constructor of shared.
      
        Generally, a StoragePolicy
        is the trickiest policy to implement. You should study and understand the
        existing StoragePolicy classes
        before you try and write your own.