There are places in the Haskell grammar where it's not known a priori whether it's an expression, a command or a pattern that is being parsed. This used to be handled by picking a parse (as an expression, say) and, if that choice later turned out to be wrong, "rejigging" it (transforming the constructed parse tree into its analog in the pattern language). The problem with that approach is that it meant having conflated sub-languages: for example, HsExpr had to have pattern-related constructors such as EWildPat and EAsPat (and further, these propagated into other compiler phases like the renamer and typechecker). This was the case until roughly a year ago, before extraordinary work by Vladislav Zavialov, who solved the ambiguity resolution issue by parsing into an abstraction with an overloaded representation:
class DisambECP b where ...
newtype ECP = ECP { runECP_PV :: forall b. DisambECP b => PV (Located b) }
This innovation might be considered to have come at a cost for developers familiar with the "old" parser, however: they now have to understand the apparent complexity introduced by the ambiguity resolution system. This post attempts to provide some intuition about how the system works and hopefully will lead to the realization that it's not that hard to understand after all!
Because this post is about building intuition, there are details that are glossed over or omitted entirely: the reader is encouraged to read Vlad's detailed explanatory comments in RdrHsSyn.hs when necessary to address that.
We start with something familiar - the GHC parser monad:
newtype P a = P { unP :: PState -> ParseResult a }
This fundamentally is a wrapper over a function PState -> ParseResult a.
The (let's call it the) "ECP system" introduces a new (and as we'll see, very related) concept. The parser validator monad:
newtype PV a = PV { unPV :: PV_Context -> PV_Accum -> PV_Result a }
So a parser validator is a function similar in spirit to a parser where:
data PV_Context: essentially a wrapper around the lexer's ParserFlags value;
data PV_Accum: the state accumulated during parsing validation (errors and warnings, comments, annotations);
data PV_Result: the parser validator function's result type, that is, data PV_Result a = PV_Ok PV_Accum a | PV_Failed PV_Accum.
Of critical interest is how this type is made a monad.
instance Functor PV where
  fmap = liftM
instance Applicative PV where
  pure a = a `seq` PV (\_ acc -> PV_Ok acc a)
  (<*>) = ap
The above reveals that an expression like return e, where e is of type a, constructs a function that, given arguments ctx and acc, ignores ctx and returns e alongside the unchanged acc under PV_Ok. It is the moral equivalent of const.
instance Monad PV where
  m >>= f = PV $ \ctx acc ->
    case unPV m ctx acc of
      PV_Ok acc' a -> unPV (f a) ctx acc'
      PV_Failed acc' -> PV_Failed acc'
The bind operation composes PV actions, threading the context and accumulators through the application of their contained functions: given an m :: PV a and a function f :: a -> PV b, m >>= f constructs a PV b wrapping a function that composes f with the function in m. If m fails, f is never run and the failure is propagated along with the accumulator.
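To make the threading concrete, here is a minimal sketch of the same shape of monad, detached from GHC. Every name in it (Ctx, Acc, Result, addError, failPV, example, runToy) is hypothetical and only loosely modeled on PV_Context, PV_Accum and PV_Result; the real types carry considerably more.

```haskell
import Control.Monad (ap, liftM)

-- Read-only context (a stand-in for PV_Context; here just a hint string).
newtype Ctx = Ctx { ctxHint :: String }

-- Accumulated state (a stand-in for PV_Accum; here just messages).
newtype Acc = Acc { accMessages :: [String] }

-- Success or failure, both carrying the accumulator.
data Result a = Ok Acc a | Failed Acc

newtype PV a = PV { unPV :: Ctx -> Acc -> Result a }

instance Functor PV where
  fmap = liftM

instance Applicative PV where
  pure a = PV (\_ acc -> Ok acc a)
  (<*>) = ap

instance Monad PV where
  m >>= f = PV $ \ctx acc ->
    case unPV m ctx acc of
      Ok acc' a   -> unPV (f a) ctx acc'  -- same context, updated accumulator
      Failed acc' -> Failed acc'          -- f is never run

-- Record a non-fatal error, decorated with the hint from the context.
addError :: String -> PV ()
addError msg = PV $ \ctx (Acc ms) ->
  Ok (Acc (ms ++ [msg ++ " (" ++ ctxHint ctx ++ ")"])) ()

-- Abort validation, keeping whatever has been accumulated so far.
failPV :: PV a
failPV = PV $ \_ acc -> Failed acc

-- Two errors are recorded in order, then a value is returned.
example :: PV Int
example = do
  addError "first"
  addError "second"
  pure 42

-- Apply the wrapped function to a fixed context and an empty accumulator.
runToy :: PV a -> ([String], Maybe a)
runToy m = case unPV m (Ctx "hint") (Acc []) of
  Ok acc a   -> (accMessages acc, Just a)
  Failed acc -> (accMessages acc, Nothing)
```

Loaded into GHCi, runToy example evaluates to (["first (hint)","second (hint)"],Just 42): both errors were stored without stopping the computation, and the context was visible to each step without being passed explicitly.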
PV is a bit more than a monad: it is also an instance of the MonadP class for monads that support parsing-related operations, providing the ability to query for active language extensions and to store warnings, errors, comments and annotations.
instance MonadP PV where
  addError srcspan msg = ....
    PV $ \ctx acc@PV_Accum{pv_messages=m} ->
      let msg' = msg $$ pv_hint ctx in
      PV_Ok acc{pv_messages=appendError srcspan msg' m} ()
  addWarning option srcspan warning = ...
  addFatalError srcspan msg =...
  getBit ext =
    PV $ \ctx acc ->
      let b = ext `xtest` pExtsBitmap (pv_options ctx) in
      PV_Ok acc $! b
  addAnnotation (RealSrcSpan l _) a (RealSrcSpan v _) = ...
  ...
The function runPV is the interpreter of a PV a. To run a PV a through this function is to produce a P a.
runPV :: PV a -> P a
That is, given a PV a, construct a function PState -> ParseResult a.
runPV m =
  P $ \s ->
    let
      pv_ctx = PV_Context {...} -- init context from parse state 's'
      pv_acc = PV_Accum {...} -- init local state from parse state 's'
      -- Define a function that builds a parse state from local state
      mkPState acc' =
        s { messages = pv_messages acc'
          , annotations = pv_annotations acc'
          , comment_q = pv_comment_q acc'
          , annotations_comments = pv_annotations_comments acc' }
    in
      -- Invoke the function in m with context and state, harvest its revised state and
      -- turn its outcome into a ParseResult.
      case unPV m pv_ctx pv_acc of
        PV_Ok acc' a -> POk (mkPState acc') a
        PV_Failed acc' -> PFailed (mkPState acc')
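The same interpretation step can be sketched over toy types. This is an illustration under stated assumptions, not GHC's code: PState here has only three invented fields (flags, messages, input), and Ctx, Acc, Result, getFlag, needsArrows and outcome are hypothetical names.

```haskell
import Control.Monad (ap, liftM)

-- Toy parser state: enabled flags, messages so far, and remaining input.
data PState = PState { flags :: [String], messages :: [String], input :: String }

data ParseResult a = POk PState a | PFailed PState

newtype P a = P { unP :: PState -> ParseResult a }

-- Toy parser validator (stand-ins for PV_Context, PV_Accum, PV_Result).
newtype Ctx = Ctx { ctxFlags :: [String] }
newtype Acc = Acc { accMessages :: [String] }
data Result a = Ok Acc a | Failed Acc
newtype PV a = PV { unPV :: Ctx -> Acc -> Result a }

instance Functor PV where
  fmap = liftM

instance Applicative PV where
  pure a = PV (\_ acc -> Ok acc a)
  (<*>) = ap

instance Monad PV where
  m >>= f = PV $ \ctx acc ->
    case unPV m ctx acc of
      Ok acc' a   -> unPV (f a) ctx acc'
      Failed acc' -> Failed acc'

addError :: String -> PV ()
addError msg = PV $ \_ (Acc ms) -> Ok (Acc (ms ++ [msg])) ()

-- Query the read-only context, in the spirit of getBit.
getFlag :: String -> PV Bool
getFlag f = PV $ \ctx acc -> Ok acc (f `elem` ctxFlags ctx)

-- Seed context and accumulator from the parse state, run the validator,
-- then write the accumulator back into the parse state.
runPV :: PV a -> P a
runPV m = P $ \s ->
  let ctx = Ctx (flags s)
      acc = Acc (messages s)
      mkPState acc' = s { messages = accMessages acc' }
  in case unPV m ctx acc of
       Ok acc' a   -> POk (mkPState acc') a
       Failed acc' -> PFailed (mkPState acc')

-- A validator that complains unless a (made-up) flag is set.
needsArrows :: PV Bool
needsArrows = do
  on <- getFlag "Arrows"
  if on then pure () else addError "Arrows is not enabled"
  pure on

-- Run a parser on a state and expose the resulting messages and value.
outcome :: PState -> P a -> ([String], Maybe a)
outcome s p = case unP p s of
  POk s' a   -> (messages s', Just a)
  PFailed s' -> (messages s', Nothing)
```

With a starting state whose flags are empty and whose messages are ["earlier"], outcome applied to runPV needsArrows gives (["earlier","Arrows is not enabled"],Just False). Messages already in the parse state survive the round trip, and the untouched input field is carried over by the record update.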
A production can yield a different AST depending on the context in which it is used. Take, for example, the production for parenthesized things:
'(' texp ')'
In the context of a pattern we expect an AST with a ParPat _ p
node whereas in the context of an expression we want an AST with an HsPar _ e
node. To this end the DisambECP
class embodies an abstract set of operations for parse tree construction.
class DisambECP b where
  ...
  -- | Return a command without ambiguity, or fail in a non-command context.
  ecpFromCmd' :: LHsCmd GhcPs -> PV (Located b)
  -- | Return an expression without ambiguity, or fail in a non-expression context.
  ecpFromExp' :: LHsExpr GhcPs -> PV (Located b)
  ... Lots of operations like this
  mkHsOpAppPV :: SrcSpan -> Located b -> Located (InfixOp b) -> Located b -> PV (Located b)
  mkHsVarPV :: Located RdrName -> PV (Located b)
  ...
The idea is that in the semantic actions of the grammar we construct and compose parser validators in terms of these abstract functions. Running the PVs produces parsers, and at the point of execution of the parsers we know the context (the nature of the AST we expect to receive), so the concrete choice for each of the abstract functions is thereby fixed (and then, on evaluation, we get the parse result).
The only wrinkle is in the return type of productions that produce parser validators. In general, they will have the form forall b. DisambECP b => PV (Located b). If they were monadic productions, though, we would be led to P (forall b. DisambECP b => PV (Located b)), and that dog don't hunt for GHC's lack of support for impredicative types. There is a standard work-around that can be employed though. This newtype is how impredicative types in monadic productions are avoided:
newtype ECP = ECP { runECP_PV :: forall b. DisambECP b => PV (Located b) }
So here, ECP is a wrapper around a PV (Located b) value where b can be any type that satisfies the constraints of class DisambECP. So, in a production that looks like
| ... {% return (ECP ...)}
we are dealing with P ECP
whereas without a newtype we would be dealing with P (forall b. DisambECP b => PV (Located b))
.
Now to produce a P (Located b)
from the PV (Located b)
in an ECP
we have this function:
runECP_P :: DisambECP b => ECP -> P (Located b)
runECP_P p = runPV (runECP_PV p)
It takes an ECP
value, projects out the parser validator contained therein and "runs" it to produce a function from PState -> ParseResult a
(a parser action).
From the DisambECP instance for HsExpr GhcPs, here's ecpFromCmd':
ecpFromCmd' (L l c) = do
  addError l $ vcat
    [ text "Arrow command found where an expression was expected:",
      nest 2 (ppr c) ]
  return (L l hsHoleExpr)
Makes perfect sense - you get a parser validator that when evaluated will store a (non-fatal) error and returns an expression "hole" (unbound variable called _
) so that parsing can continue.
Continuing, the definition of ecpFromExp'
:
ecpFromExp' = return
Also sensible. Simply calculate a function that returns its provided acc argument together with the given constant expression under a PV_Ok result (see the definition of pure in the Applicative instance for PV given above).
Parenthesizing an expression for this DisambECP
instance means wrapping a HsPar
around the given e
:
mkHsParPV l e = return $ L l (HsPar noExtField e)
And so on. You get the idea.
So how does this all fit together? Consider again the production of parenthesized things:
| '(' texp ')' { ECP $
                   runECP_PV $2 >>= \ $2 ->
                   amms (mkHsParPV (comb2 $1 $>) $2) [mop $1,mcp $3] }
We note that the texp
production calculates an ECP
. Stripping away for simplicity the annotation and source code location calculations in the semantic action, in essence we are left with this.
ECP $ runECP_PV $2 >>= \ $2 -> mkHsParPV $2
The effect of runECP_PV is to project out the forall b. DisambECP b => PV (Located b) value from the result of texp. Recalling that unPV projects out the function that the PV wrapper shields, and by substitution of the definition of bind, we obtain roughly:
ECP $ PV $ \ctx acc ->
  case unPV (runECP_PV $2) ctx acc of
    PV_Ok acc' a -> unPV (mkHsParPV a) ctx acc'
    PV_Failed acc' -> PV_Failed acc'
The net effect is that we construct a new parser validator (function) from the parser validator (function) returned by the texp production, one that puts parentheses around whatever that function produces when evaluated. If used in a context where texp generates an LPat GhcPs, that'll be a ParPat node; if an LHsExpr GhcPs, then an HsPar node.
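The whole arrangement can be imitated in miniature. The sketch below is a simplification under stated assumptions: Expr and Pat are toy ASTs, the Disamb class has three invented operations (mkVarPV, mkParPV, mkWildPV), and PV is reduced to a message accumulator with no context and no failure. None of these definitions are GHC's; only the shape of the ECP newtype and the name runECP_PV are borrowed from the post.

```haskell
{-# LANGUAGE RankNTypes #-}

import Control.Monad (ap, liftM)

-- Toy ASTs for the two sub-languages.
data Expr = EVar String | EPar Expr | EHole deriving (Eq, Show)
data Pat  = PVar String | PPar Pat  | PWild deriving (Eq, Show)

-- A pared-down validator: it only accumulates messages.
newtype PV a = PV { unPV :: [String] -> ([String], a) }

instance Functor PV where
  fmap = liftM

instance Applicative PV where
  pure a = PV (\ms -> (ms, a))
  (<*>) = ap

instance Monad PV where
  m >>= f = PV $ \ms -> let (ms', a) = unPV m ms in unPV (f a) ms'

addError :: String -> PV ()
addError msg = PV $ \ms -> (ms ++ [msg], ())

-- Abstract tree-building operations, one instance per sub-language.
class Disamb b where
  mkVarPV  :: String -> PV b
  mkParPV  :: b -> PV b
  mkWildPV :: PV b  -- only meaningful in patterns

instance Disamb Expr where
  mkVarPV  = pure . EVar
  mkParPV  = pure . EPar
  mkWildPV = addError "wildcard found where an expression was expected" >> pure EHole

instance Disamb Pat where
  mkVarPV  = pure . PVar
  mkParPV  = pure . PPar
  mkWildPV = pure PWild

-- The newtype hides the quantifier, so ECP is an ordinary monomorphic type.
newtype ECP = ECP { runECP_PV :: forall b. Disamb b => PV b }

var :: String -> ECP
var x = ECP (mkVarPV x)

wild :: ECP
wild = ECP mkWildPV

-- The analog of the '(' texp ')' action, minus locations and annotations.
paren :: ECP -> ECP
paren inner = ECP (runECP_PV inner >>= \e -> mkParPV e)

-- Fixing b at the use site selects the instance.
runECP :: Disamb b => ECP -> ([String], b)
runECP p = unPV (runECP_PV p) []

asExpr :: ECP -> ([String], Expr)
asExpr = runECP

asPat :: ECP -> ([String], Pat)
asPat = runECP
```

The single value paren wild can be interpreted both ways: asPat (paren wild) gives ([],PPar PWild), while asExpr (paren wild) gives (["wildcard found where an expression was expected"],EPar EHole). Nothing in paren mentions expressions or patterns; the choice is made only when the result type is fixed.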