# Words & Parallel Universes

Let’s assume that, in a parallel universe, a word looks like a tree of letters instead of a sequence, so we can no longer enumerate the letters of a word. Ask yourself these questions:

- How different would the languages look?
- How much more complex would the algorithms be?
- How could books adapt to the change?
- What would be the consequences?

In the following sections, I will introduce some concepts from formal languages & automata, then redefine and analyze the grammar types.

## Introduction to Formal Languages & Automata

*If you don’t understand the concepts, you can find additional sources at the end.*

**Important Definitions**

*Alphabet*: A set of letters *Σ = {a_{1}, …, a_{m}}*. Each letter stands for a value.

*Word*: A sequence of letters *w = (w_{1}, w_{2}, …, w_{n})*, or simply *w_{1} w_{2} … w_{n}*, where *∀ i ∈ {1, …, n} . w_{i} ∈ Σ*. Each word stands for an argument.

*Language*: A set of words. Each language stands for a problem that accepts or rejects words.

*Grammar*: Basically a set of rules and letters that defines the syntax of a language; it can create any word in the language. Two types of grammar are worth considering in this article:

- *Regular (or type-3) grammar*: creates a word from left to right **[1]* (1 direction)
- *Context-free (or type-2) grammar*: creates a word in both directions (2 directions)

*Automaton*: Similarly, a set of states and letters that defines the syntax. It accepts the words of a language and rejects all other words.

*Word Problem*: The algorithmic problem of deciding whether a word belongs to a language. In general, word problems can be solved using top-down recursive formulas **[2]*. If the root node satisfies the criteria, the word is included in the language.

*Notes:*

**[1]: There are also left-regular grammars, which work in the reverse direction. Each one generates the reverse language of a right-regular language, so every regular grammar can be represented by a right-regular grammar.*

**[2]: Word problems are not always solved recursively. The word problem for regular grammars takes O(n) ( = linear) time, and for context-free grammars O(n^3) ( = cubic) time with a naive implementation.*
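To make the cubic bound concrete, here is a minimal sketch of the naive CYK membership test for a context-free grammar in Chomsky normal form; the grammar below for the language {a^k b^k | k ≥ 1} is a made-up example:

```python
def cyk(word, rules, start):
    """Naive CYK membership test: O(n^3) in the word length.

    rules: dict mapping a variable to a list of bodies, where a body is
    either a terminal (a 1-letter string) or a pair of variables.
    """
    n = len(word)
    # table[i][j] = set of variables deriving word[i : i + j + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        for var, bodies in rules.items():
            if ch in bodies:
                table[i][0].add(var)
    for length in range(2, n + 1):           # substring length
        for i in range(n - length + 1):      # start position
            for split in range(1, length):   # split point
                for var, bodies in rules.items():
                    for body in bodies:
                        if isinstance(body, tuple):
                            b, c = body
                            if (b in table[i][split - 1]
                                    and c in table[i + split][length - split - 1]):
                                table[i][length - 1].add(var)
    return start in table[0][n - 1]

# Chomsky normal form of {a^k b^k}: S -> AT | AB, T -> SB, A -> a, B -> b
rules = {
    "S": [("A", "T"), ("A", "B")],
    "T": [("S", "B")],
    "A": ["a"],
    "B": ["b"],
}
print(cyk("aabb", rules, "S"))  # True
print(cyk("aab", rules, "S"))   # False
```

The three nested position loops give the cubic time bound mentioned above.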

## Words & Grammars in Our Universe

**Words**

Previously, we defined words as linked lists, i.e. sequences of symbols.

A word can be stored not only in a linked list, but also in a linear graph with:

- Nodes: *{w_{1}, …, w_{n}}*
- Edges: *{{w_{1}, w_{2}}, …, {w_{n-1}, w_{n}}}*

In the illustration, we can see a linear graph representing a word, which starts at node *w_{1}* and ends at node *w_{7}*.

In this case, we can also observe that each node has at most 2 neighbors.
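As a small sketch of this representation (the function name and 1-based node numbering are my own choices), a word can be turned into its linear graph and the degree bound checked:

```python
from collections import Counter

def word_as_linear_graph(word):
    """Represent a word as a linear graph: nodes are (position, letter)
    pairs and edges connect consecutive positions."""
    nodes = list(enumerate(word, start=1))
    edges = [(i, i + 1) for i in range(1, len(word))]
    return nodes, edges

nodes, edges = word_as_linear_graph("banana")

# In a linear graph, every node has at most 2 neighbors:
degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1
print(max(degree.values()))  # 2
```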

**Grammars**

A grammar *G = (V, Σ, P, S)* consists of:

- a set of variables, e.g. *V = {A_{1}, A_{2}, …, A_{r}}*
- a set of symbols, e.g. *Σ = {a_{1}, a_{2}, …, a_{m}}*
- a set of rules, e.g. *P = {A_{1} → A_{1}a_{1} | a_{1}A_{1} | a_{1}, A_{2} → A_{3}A_{4}}*
- a starting variable, e.g. *S = A_{j}*

The rules of these grammars look as follows:

- Regular grammar: *A → ε | h | Bh | hC*
- Context-free grammar: *A → ε | h | Bh | hC | BC*

## Other Universes

We have seen that normally a word can be represented as a graph in total order.

Now we assume there is a parallel universe called 𝒜.

The words in 𝒜 are represented by graphs in partial order.

In contrast to a totally ordered graph, not all pairs of nodes can be compared, which is why a graph in partial order has more than one linear representation.

That means words in 𝒜 cannot be defined using a single string.

Instead, we define words as d-ary trees where:

- n is the number of nodes,
- ε is the root node and
- d is the maximum degree ( = max. number of neighbor connections), a number ranging from 2 to m, the number of letters in Σ.

Each node can be named by its history of ancestors, so that the names are unique; this is why backtracking fits the naming convention very well. The words are then defined recursively using backtracking. Let h be the depth of recursion:

*Recursion start (h = 0)*: *w_{()} = ε* is the blank word.

*Recursion step (h > 0)*: the node *w_{(i_{1}, …, i_{h-1}, i_{h})}* at level h has the parent node *w = w_{(i_{1}, …, i_{h-1})}* and is connected to it over the *i_{h}*-th letter of the alphabet.
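A minimal sketch of this naming scheme, assuming each node stores its index path from the root (the class and method names are my own):

```python
class TreeWord:
    """A word in the parallel universe as a d-ary tree; every node is
    named by the index path from the root, i.e. its history of ancestors."""

    def __init__(self, path=()):
        self.path = path        # () is the root: the blank word w_() = epsilon
        self.children = {}      # letter index i_h -> child TreeWord

    def add(self, letter_index):
        """Attach a child connected over the letter_index-th letter."""
        child = TreeWord(self.path + (letter_index,))
        self.children[letter_index] = child
        return child

root = TreeWord()       # recursion start, h = 0
a = root.add(1)         # level h = 1, connected over the 1st letter
b = root.add(2)
c = a.add(1)            # level h = 2
print(c.path)  # (1, 1)
```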

# Open Questions & Conjectures

## What would a regular language in 𝒜 look like?

Languages created by regular grammars are considered read-only (or write-but-never-look-back) languages.

Applied to the parallel universe, such a language could be visualized as a DAG (Directed Acyclic Graph).

## What would a context-free language in 𝒜 look like?

Context-free grammars have a more complex definition than regular grammars.

Unlike in a regular language, letters can be appended at the right end or the left end. In this context, left means the root and right means the leaves.

It can be illustrated as an undirected graph, in contrast to regular grammars in 𝒜.

## Is a word in 𝒜 equal to a language in our universe?

Yes. As we have mentioned before, a word in 𝒜 is stored in a graph in partial order.

So such a graph can have many configurations, because not all pairs of nodes can be compared.

If each end configuration ( = a path starting at the root node and ending at a leaf) is stored in a set of words, that set becomes a language in our universe.

Then a language in 𝒜 is equivalent to a class of languages.
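This observation can be sketched by enumerating the root-to-leaf paths of a tree-word; the nested-dict tree encoding (letter indices as keys) is a made-up convention for illustration:

```python
def paths_to_language(tree, alphabet, prefix=""):
    """Collect every root-to-leaf path of a tree-word as an ordinary word.
    tree: nested dict mapping 1-based letter indices to subtrees."""
    if not tree:                      # a leaf ends one configuration
        return {prefix}
    language = set()
    for i, subtree in tree.items():
        language |= paths_to_language(subtree, alphabet, prefix + alphabet[i - 1])
    return language

# A small tree-word over the alphabet "ab":
# the root branches into 'a' (followed by 'b') and into 'b' (a leaf)
tree = {1: {2: {}}, 2: {}}
print(paths_to_language(tree, "ab"))  # {'ab', 'b'}
```

The resulting set of strings is exactly a language in our universe, one word per end configuration.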

## What are the complexity classes of these grammars in 𝒜?

*Regular grammars*: There is no efficient algorithm that solves the word problem of the altered regular grammars in polynomial time, because the word is non-deterministic and each path of the word starting from the root node has polynomial length. It is an NP problem and can be solved using dynamic programming.

*Context-free grammars*: The word problem of the altered context-free grammars cannot be solved with NP algorithms, because the directions of each step have to be stored in polynomial space. It is a PSPACE problem.

In the following section, we’ll justify those statements in the context of the most representative logical problems of NP and PSPACE. Let’s start with a quick introduction:

**NP vs PSPACE**

An NP problem can be depicted as a game where a player explores an environment and tries to reach a happy ending. If a desired solution exists, the player wins; otherwise, the player loses.

A PSPACE problem, however, is illustrated as a game where two players play against each other using different strategies. If one player has a winning strategy, the other player loses.

Regular and context-free grammars can be seen as 1-player and 2-player games respectively.

**The most representative problems**

Given a boolean formula F with arguments *x_{1}, …, x_{n} ∈ {0, 1}*:

*SAT (Satisfiability Problem, NP-complete)*: Evaluate the statement *∃ x_{1} : … : ∃ x_{n} : F(x_{1}, …, x_{n})* and return a satisfying assignment if possible.

*QSAT (Quantified Satisfiability Problem, PSPACE-complete)*: Evaluate the statement *Q_{1} x_{1} : … : Q_{n} x_{n} : F(x_{1}, …, x_{n}); Q_{1}, …, Q_{n} ∈ {∃, ∀}* and return a satisfying assignment for each existentially quantified variable if possible.
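Both problems can be illustrated with brute-force evaluators (exponential time, so only for intuition; the function names and the XOR example formula are my own):

```python
from itertools import product

def sat(f, n):
    """Brute-force SAT: return a satisfying assignment of f over n
    boolean variables, or None if none exists."""
    for xs in product([0, 1], repeat=n):
        if f(*xs):
            return xs
    return None

def qsat(f, quantifiers, xs=()):
    """Brute-force QSAT: evaluate Q1 x1 : ... : Qn xn : f(x1, ..., xn),
    where each quantifier is "E" (exists) or "A" (for all)."""
    if not quantifiers:
        return f(*xs)
    q, rest = quantifiers[0], quantifiers[1:]
    branches = (qsat(f, rest, xs + (v,)) for v in (0, 1))
    return any(branches) if q == "E" else all(branches)

f = lambda x, y: (x or y) and not (x and y)   # XOR as a boolean formula
print(sat(f, 2))                              # (0, 1)
print(qsat(f, ["A", "E"]))                    # True: for all x there is a y
```

The existential quantifier maps to `any` and the universal one to `all`, which is exactly the 1-player vs 2-player distinction used below.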

Given:

- an alphabet *Σ*
- a word argument *w* of length h
- the characteristic function of rules *E*, so that *E((x, y)) = 1* if *x → y* is a valid rule, otherwise 0.

The top-down boolean formulas verifying that a word is accepted in a language are given for regular grammars and for context-free grammars.

**Complexity of regular grammars in 𝒜**

As we can see, the formula for regular grammars includes only ∃ (the existential quantifier) at every step. If the recursion is eliminated, it becomes:

*∃ x_{1} : … : ∃ x_{n} : (A(x_{1}) ∧ A(x_{1}x_{2}) ∧ … ∧ A(x_{1}…x_{n}))*

and it can be solved using SAT.
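A direct search sketch of this existential formula, assuming the tree-word is a nested dict keyed by letters and *A* is a predicate on prefixes (all names and the example predicate are my own):

```python
def exists_accepted_path(tree, predicate, prefix=""):
    """DFS over a tree-word: is there a root-to-leaf path x1...xn whose
    every prefix satisfies A? This mirrors the existential formula
    exists x1 : ... : exists xn : A(x1) and ... and A(x1...xn)."""
    if prefix and not predicate(prefix):
        return False                    # one failing prefix kills this branch
    if not tree:
        return bool(prefix)             # reached a leaf: a path was found
    return any(
        exists_accepted_path(subtree, predicate, prefix + letter)
        for letter, subtree in tree.items()
    )

# Hypothetical predicate A: no prefix may contain the factor "bb"
A = lambda s: "bb" not in s
tree = {"a": {"b": {"b": {}}}, "b": {"a": {}}}
print(exists_accepted_path(tree, A))  # True: the path "ba" works
```

Memoizing the result per node would turn this search into the dynamic-programming approach mentioned above.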

**Complexity of context-free grammars in 𝒜**

Besides the existential quantifier, the formula for context-free grammars also includes ∀ (the universal quantifier). If the recursion were eliminated, it could be solved by a QSAT algorithm.

## How could this adjustment affect type-1 and type-0 languages?

I leave this question unanswered. But it is interesting to note:

*Context-sensitive (type-1) languages*: In contrast to context-free languages, they can additionally have letters on the left and right side of rules.

- For example: *jAk → jak*.
- Complexity: NLINSPACE (a non-deterministic Turing machine using linear space).

In the case of **𝒜**, the number of possible combinations on the right side of a rule is proportional to the number of subsets of the alphabet.

That’s why the number of branches for each node scales with 2^d (the number of letter combinations) instead of d ( = the number of letters).

So my guess for the complexity of type-1 and type-0 languages would be EXPTIME-hard, because each recursion step would take exponential time.

## Final Thoughts

First, we gave a brief introduction to formal languages & automata, complexity theory, and logical problems.

Then we applied what we learned to describe a parallel universe where words are represented using trees (instead of linked lists), and analyzed the behaviour of regular and context-free grammars among the grammar types.

Using the satisfiability problems and the single-player / two-player game analogy, we showed that regular grammars become an NP problem, whereas context-free grammars become a PSPACE problem.

Finally, we suggested that the lower types of grammars would be at least EXPTIME-hard.

I hope you found the article inspiring, thank you for your interest, and feel free to contact me on LinkedIn if you think I made a mistake somewhere!

*This article is built on top of my other articles:*

*And also special thanks to **csacademy.com** for interactive GUI for designing graphs and **overleaf.com** for LaTeX rendering.*

*Snapshots with unknown sources are all taken by me.*