@@ -3,9 +3,9 @@ Our translation covers entirely the syntax of the Coq language and it preserves

Moreover, we are confident that, if and when future work yields a complete formalization of the Coq typing rules in an LF-like logical framework, our translation will be in a format suitable for rechecking the entire library --- with the obvious caveat that such a rechecking would face even more serious scalability issues than we had to overcome so far.

\CSC{An obvious way to verify that the exported information is sound and complete for type-checking would be to implement an importer for Coq itself or, in alternative, for an independent verifier for the logic of Coq, like the one implemented in Dedukti or the one developed in the HELM/MoWGLI projects~\cite{DBLP:journals/amai/AspertiPCGS03} and later incorporated into Matita 0.5.x series~\cite{DBLP:conf/cade/AspertiRCT11}. In both cases one would need to develop a translation from MMT theories to the modular constructs of the language, which requires more research. For example, no translation of MMT theories and theory morphisms into modules, module types and functors is currently known.}

An obvious way to verify that the exported information is sound and complete for type-checking would be to implement an importer for Coq itself or, in alternative, for an independent verifier for the logic of Coq, like the one implemented in Dedukti or the one developed in the HELM/MoWGLI projects~\cite{DBLP:journals/amai/AspertiPCGS03} and later incorporated into Matita 0.5.x series~\cite{DBLP:conf/cade/AspertiRCT11}. In both cases one would need to develop a translation from MMT theories to the modular constructs of the language, which requires more research. For example, no translation of MMT theories and theory morphisms into modules, module types and functors is currently known.

\CSC{We would also like to stress that independent verification is not the aim of our effort: the main point of exporting the library of Coq to MMT is to allow independent services over them, like queries, discovery of alignments with libraries of other tools or training machine learning advisers that can drive hammers. Most of these services can be implemented even if the typing information is incomplete or even unsound (e.g. if all unvierses are squashed to a single universe, making the logic inconsistent).}

We would also like to stress that independent verification is not the aim of our effort: the main point of exporting the library of Coq to MMT is to allow independent services over them, like queries, discovery of alignments with libraries of other tools or training machine learning advisers that can drive hammers. Most of these services can be implemented even if the typing information is incomplete or even unsound (e.g. if all unvierses are squashed to a single universe, making the logic inconsistent).

\paragraph{Limitations and Future Work}

Our translation starts with the Coq kernel data structures and is thus inherently limited to the structure seen by the kernel.

...

...

@@ -13,9 +13,9 @@ Therefore, record types and type classes are presented just as inductive types,

This is unfortunate as recent Coq developments, most importantly the Mathematical Components project \cite{gonthier_packaging}, have made heavy use of records to represent theory graph--like structuring and an unelaborated representation would be more informative to the user and to reasoning tools.

In fact, even sections are not visible to the kernel, and we were able to include them because we were able to reconstruct the section structure during our translation.

We expect that similar efforts may allow for including record types and canonical structures in the theory graph in the future \CSC{and we plan to start working on that next.}

We expect that similar efforts may allow for including record types and canonical structures in the theory graph in the future and we plan to start working on that next.

\CSC{Many libraries avoid module and functors and achieve modularity using other more recent features of Coq that are invisible at the kernel level, like type classes. Moreover type classes, canonical structures, coercions, etc. are necessary information to extend a library because they explain how the various mathematical notions are meant to be used. While the already cited services that we plan to provide do not depend on them, importing the library in another system to build on top of it surely does. Therefore a future challenge will be to find system independent generalizations and representations of such constructs, which will be necessary to incorporate them into a logic and system independent tool like MMT.}

Many libraries avoid module and functors and achieve modularity using other more recent features of Coq that are invisible at the kernel level, like type classes. Moreover type classes, canonical structures, coercions, etc. are necessary information to extend a library because they explain how the various mathematical notions are meant to be used. While the already cited services that we plan to provide do not depend on them, importing the library in another system to build on top of it surely does. Therefore a future challenge will be to find system independent generalizations and representations of such constructs, which will be necessary to incorporate them into a logic and system independent tool like MMT.

Our formal representation of Coq declarations includes the types of all constants and variables, but we use a single type in the logical framework for all Coq expressions.

As we explain in Sect.~\ref{sec:logic}, we consider a typed representation of expressions infeasible at this point.

We give only an extremely dense explanation of the Coq language and refer to Appendix~\ref{app:coq} and \cite{coq} for details.

We give only an extremely dense explanation of the Coq language and refer to Appendix A\footnote{Available online at \url{https://kwarc.info/people/dmueller/pubs/CoqImport.pdf}} and \cite{coq} for details.

We use a \textbf{grammar} for the \emph{abstract} syntax seen by the Coq kernel (Fig.~\ref{fig:coqgrammar}) because that is the one that our translation works with.

Even though we do not have space to describe all aspects of the translation in detail, we give an almost entire grammar here in order to document most of the language features we translate. A slightly more comprehensive grammar is presented in the companion paper~\cite{coqtoxml_csc}: the omitted features do not pose additional problems to the translation to MMT and are omitted to simplify the presentation.

@@ -42,7 +42,7 @@ This is inconsequential except that we have to add corresponding include declara

Fortunately, this information is anyway stored by Coq so that this is no problem.

In the sequel, we write $\ov{i}$ for the \mmt translation of the Coq item $i$ except that, if $i$ is a Coq identifier, we write $i$ in \mmt as well if no confusion is possible.

We omit the translation of base logic expressions and refer to Appendix~\ref{app:sections} for the translation of sections.

We omit the translation of base logic expressions and refer to Appendix C\footnote{Available online at \url{https://kwarc.info/people/dmueller/pubs/CoqImport.pdf}} for the translation of sections.

%The borders between namespace, theory, and constant names are not inherent a Coq identifier.

%But it can be inferred from the xml files: the existence of a \texttt{theory.xml} marks a namespace, and the distinction between theory and constant can be made based on the XML tags.

...

...

@@ -193,11 +193,6 @@ More formally:

However, considering that the Coq library is already well-typed and that the covariant translation is so much simpler, that is sufficient for many practical applications.

\paragraph{Applying Functors}

Coq functor application may be partial and curried.

Thus, it is sufficient to restrict attention to $r=1$.

So consider a module type declaration $\ModTyp t(x:T):=B$ and a module $m:T$.

We have to define the translation of the module type $t(m)$, whose semantics is determined in Coq by substituting $m$ for $x$ in $B$.

\begin{wrapfigure}{r}{4cm}

\vspace{-3em}

\begin{tikzpicture}

...

...

@@ -216,7 +211,10 @@ We have to define the translation of the module type $t(m)$, whose semantics is

\end{tikzpicture}

\vspace{-2em}

\end{wrapfigure}

Coq functor application may be partial and curried.

Thus, it is sufficient to restrict attention to $r=1$.

So consider a module type declaration $\ModTyp t(x:T):=B$ and a module $m:T$.

We have to define the translation of the module type $t(m)$, whose semantics is determined in Coq by substituting $m$ for $x$ in $B$.

We want the translation to be compositional, i.e., defined in terms of the theory $\ov{t}$ arising from the translation of $t$ and the morphism $\ov{m}^*:\ov{t}\to\ov{m}$ arising from the translation of $m$, as in the diagram on the right.

As defined above, the functor $t$ is translated to a theory $\ov{t}$ with a nested theory $\ov{t/x}$ that conforms to $\ov{T}$ as well as an include of $\ov{t/x}$.

Let $p$ be the Coq file or module (type) in which $t(m)$ is well-typed; thus, $\ov{p}$ is a theory that includes $\ov{m}$.

We define an \mmt theory \Coq that defines the base logic of Coq.

This theory will occur as the meta-theory of all \mmt theories generated from files in the Coq library (except when flags are used, see below).

The theory \Coq is available at \url{https://gl.mathhub.info/Coq/foundation/blob/master/source/Coq.mmt}.

We briefly describe it in the sequel and refer to Appendix~\ref{app:logic} for details.

We briefly describe it in the sequel and refer to Appendix B\footnote{Available online at \url{https://kwarc.info/people/dmueller/pubs/CoqImport.pdf}} for details.

\paragraph{Expressions}

Due to lack of space, we only present the encoding of the PTS fragment of Coq, i.e. we omit \texttt{let\ldots in}, projections, inductive types, pattern matching and (co)fixpoints. The encoding is quite straightforward.