The categorical contours of the Chomsky-Sch\"utzenberger representation theorem
We develop fibrational perspectives on context-free grammars and on nondeterministic finite-state automata over categories and operads. A generalized CFG is a functor from a free colored operad (aka multicategory) generated by a pointed finite species into an arbitrary base operad: this encompasses...
Saved in:
| Published in: | Logical methods in computer science Vol. 21, Issue 2 |
|---|---|
| Main Authors: | , |
| Format: | Journal Article |
| Language: | English |
| Published: |
Logical Methods in Computer Science e.V
16.05.2025
|
| Subjects: | |
| ISSN: | 1860-5974, 1860-5974 |
| Online Access: | Get full text |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| Summary: | We develop fibrational perspectives on context-free grammars and on nondeterministic finite-state automata over categories and operads. A generalized CFG is a functor from a free colored operad (aka multicategory) generated by a pointed finite species into an arbitrary base operad: this encompasses classical CFGs by taking the base to be a certain operad constructed from a free monoid, as an instance of a more general construction of an \emph{operad of spliced arrows} $\mathcal{W}\,\mathcal{C}$ for any category $\mathcal{C}$. A generalized NFA is a functor from an arbitrary bipointed category or pointed operad satisfying the unique lifting of factorizations and finite fiber properties: this encompasses classical word automata and tree automata without $\epsilon$-transitions, but also automata over non-free categories and operads. We show that generalized context-free and regular languages satisfy suitable generalizations of many of the usual closure properties, and in particular we give a simple conceptual proof that context-free languages are closed under intersection with regular languages. Finally, we observe that the splicing functor $\mathcal{W} : Cat \to Oper$ admits a left adjoint $\mathcal{C}: Oper \to Cat$, which we call the \emph{contour category} construction since the arrows of $\mathcal{C}\,\mathcal{O}$ have a geometric interpretation as oriented contours of operations of $\mathcal{O}$. A direct consequence of the contour / splicing adjunction is that every pointed finite species induces a universal CFG generating a language of \emph{tree contour words.} This leads us to a generalization of the Chomsky-Sch\"utzenberger Representation Theorem, establishing that a subset of a homset $L \subseteq \mathcal{C}(A,B)$ is a CFL of arrows if and only if it is a functorial image of the intersection of a $\mathcal{C}$-chromatic tree contour language with a regular language. |
|---|---|
| ISSN: | 1860-5974 1860-5974 |
| DOI: | 10.46298/lmcs-21(2:12)2025 |