Introduction of Alphabet, String, and Languages
This blog explain basic concepts to start Automata theory or Compiler design course. It also includes theorems with its proofs and solved examples.
Saturday, 14 May 2016
Thursday, 24 December 2015
Language Operations, Reflection (Image) of Language, Concatenation of Language, Set Operations, Star, Properties of Star, Cross, Special Cases, Related Theorems with its Proofs, and Related Examples with its Solutions.
Language Operations
It is simple to give an
operation, which are originally defined on strings, to an entire language: for it we just apply the operation to all
the sentences..
Reflection / image of a language
The reflection of a language L is the set of strings that are the reflection of a sentence:
LR = {x | x = yR ^ y Î L}
(it is known as characteristic predicate);
note that ^ is conjunction, represented by p ^ q, meaning “p and q”.
(it is known as characteristic predicate);
note that ^ is conjunction, represented by p ^ q, meaning “p and q”.
Here, the strings x are specified by the property expressed in the characteristic predicate.
In other words, for any language L, LR = {xR | x Î L},
for example, let:
L = {001, 10, 111} then LR = {100, 01, 111}.
Concatenation of languages
Concatenation of language L1 and L2 is defined as:
L1•L2 = {xy | xÎ L1 ^ yÎ L2}.
L1•L2 = {xy | xÎ L1 ^ yÎ L2}.
Example 1: Consider the following
table.
L1
|
L2
|
L1L2
|
{a, ab}
{a, ab}
{a, aa}
|
{bb, b}
{a, ab}
{a, aa}
|
{a, ab}{bb, b}
={abb, ab, abbb}
{a, ab}{a, ab}
= {aa, aab, aba, abab}
{a, aa}{a, aa}
= {aa, aaa, aaaa}
|
From this, the m-th power
operation on a language is simple:
Lm = Lm-1•L, m > 0
L0 = {e}
|
That is,
L1=L,
L2=LL,
L3= LLL, etc.
L0 = {e}.
L1=L,
L2=LL,
L3= LLL, etc.
L0 = {e}.
Special case 1: Let us consider some special
cases:
f0 = {e}
L•f = f•L = f
L• {e} = {e}• L = L
|
The m-th power operation gives a definition of the string of length not
exceeding some integer k. Let us consider
the alphabet S = {a, b}.
For k = 3, the language is defined as
L = S0 È S1 È S2 È S3.
= {e} È {a, b} È {aa,
ab, ba, bb} È {aaa, aab, aba, abb, baa, bab, bba, bbb}.
= {e, a, b,
aa, ab, ba, bb, aaa, aab, aba, abb, baa, bab, bba, bbb}.
Language L may also define as:
L = {e, a, b} 3.
Set Operations
Since a language is a set, we
can apply the set operations: union (È),
intersection (Ç), and difference (\), to languages. We can also apply the set
relation operations: inclusion (Í ),
strict inclusion (Ì), and equality (=) apply as well.
We can define universal language as the set of all strings of alphabet,S, of any
length, including zero. Thus, the universal
language is infinite and it is the union of all the powers of the alphabet:
Luniversal
= S0 È S1 È S2 È …
|
The
complement of a language L of
alphabet, S, denoted by Ø L is the set difference:
ØL = Luniversal
\ L
|
i.e., the
set of the strings of alphabet S those are not in L.
Luniversal = Ø f
|
Note 1: The complement of a finite language is always infinite, for example, the
set of strings of any length except two is:
Ø ({0,1}2) = e È {a,b}1 È {a,b}3 È ……….
On the other hand, the complement of an infinite language may
or may not be finite, For example,
the complement of the universal language is finite; and the complement of
the set of even length strings with alphabet {a} is:
L = {a2n | n ³ 0} Ø L = {a2n+1 | n ³ 0}
i.e., L = {a0, a2, a4 ……} is infinite
language and its complement Ø L= {a1, a3, a5,….} is also a infinite language.
Star and cross
The star (also known as Kleene’s
star and as closure by concatenation)
operation is defined as the union of all
the power of the base language:
L* = È La
a = 0…∞
|
Þ L* = L0 È L1
È L2 È …..
Þ L* = e È L È L2
È …..
Example 2: Let us
consider L= {ab, ba}, then:
L* = {e, ab,
ba, abab, abba, baab, baba, ……}.
Every string of the star can
be divided into substrings which are sentences
(refer to definition
of sentence) of the base language L.
Note that, with a
finite base language L, the “starred” language
L* is infinite.
Starred language L* may be identical to base language L, as in:
L = {a2n | n ³ 0} and L*
= {a2n | n ³ 0}.
Thus, L = L*.
Special case 2: when the base language is
an alphabet, S, then the star S*
contains all the strings obtained by
concatenating terminal symbols.
S = {a}. Let it is the base language.
S* = {e, a, aa,
aaa, …}.
Þ S* = S0 È S1 È S2 È …
Þ S* = Luniversal.
Þ S* = universal language of
alphabet, S.
We know that any formal language is a subset of the
universal language of the same alphabet.
Therefore, we can also write a relation:
L
|
To say that L is a language of alphabet, S.
Properties of star
L Í L* :(known
as monotonicity property).
if (x Î L* ^ y Î L*) then xy Î L* :(known as closure
by concatenation property).
(L*)* = L* :(known
as idempotence property).
(L*)R = (LR)* :(known as commutability of star and reflection
property).
The monotonicity property says that
any language is included in its star, i.e., L Í L*. But
for the language, L = {a2n | n
³ 0} we have an equality L* = L. This equality follows from idempotence property and the fact that L can be equivalently defined by the starrer language L* which is
equivalent to {aa}*.
Special case 3: For the empty language and empty string we have the identities.
f* = {e}.
{e}* = {e}.
|
Cross
Cross (or non reflective closure by concatenation)
is derived from star.
L+ = È La
a = 1
… ∞
|
Þ L+ = L È L2 È …..
It differs from the star because the union is taken excluding power zero. Thus, the following relations
hold:
L+ Í L*
e Î L+ if and only if e Î L
L+ = LL* = L*L
|
Example 3: {e, aa}+ = {e, aa,
aaaa, …..}. Therefore, {e, aa}+ = {a2n | n ³ 0}.
Theorem 1: Prove that for every language L, (L*)* = L*.
Proof: According
to definition, S* is a language in which each string is a concatenation of zero or
more strings fromS. Formally:
S*= {w | w = e, or w =
x1……xk for k ³ 1, and xi Î S}.
Automatically, we get S Ì S*.
Let w Î (L*)*, we need
to prove wÎ L*.
When w = e, then w Î L*, because e Î L*, by the definition of star
closure.
Let us assume that w ¹ e. Then we can write w = x1….xk for
some k ³ 1 (for example, let w = puja, then x1 = p, x2
= u, x3 = j, and x4 = a;) and all xi Î L* (for example, let L = {p, u, j, a} then L*
= {e, p, u, j, a,….}.
Therefore, we can re-state the above statement as: “show that for any k ³ 1, if w
= x1…..xk, where all xi Î L*, then w Î L*”.
Let us proof the re-stated
statement by using induction on k.
Base case: when k = 1 then w = x1,
where x1 Î L*, so w Î L*.
Induction
theory: Let us
assume that the claim is true for k = i,
i.e., w = x1 ….. xi, where all xi Î L*, then w Î L*.
Induction step:
Let the claim is
true for k = i+1. Let w = x1…xixi+1,
where all xi Î L*.
By the
definition of concatenation, w = yxi+1,
where y = x1…xi.
We can also
state that, if x Î L* and y Î L*, then xyÎ L*.
Thus, by
induction theory: y Î L*, and xi+1 Î L*.
Therefore, we
conclude that: w = yxi+1 Î L* .
Thus, prove.
Theorem
2: Prove that L*• L*= L*.
Proof: It is very
clear that L*Í L* •
L*,
since any string w Î L* can be written as w = w•e. Let w Î L*• L*. We can express w as w1•w2, where w1•w2.Î L*. But if w1 and w2
are in L*, then so is
their concatenation, this implies that w Î L*, and hence L*• L* Í L*.
It follows that L*• L* = L*.
Thus, prove.
Theorem 3: Prove that "
i ³
1, (L*)i =
L*.
Proof: We can prove it
by induction.
Base case: We have, (L*)1 =
L*.
Induction theory: Let us assume that (L*)k = L* , for some k ³ 1.
Induction step: Let the claim is true for k+1, i.e., (L*)(k+1) =
L*.
We have,
(L*)(k+1) =
(L*)k • L* -
- - -(1)
Using the above
induction theory, we can write equality (1) as:
(L*)(k+1) = L*• L* -
- - -(2)
According to the
theorem 1.5, we know that L*•
L* = L*.
Thus, we can
write equality (2) as:
(L*)(k+1) =
L*
Means, the claim
is true for k + 1. Thus, we conclude that:
" i ³
1, (L*)i = L*.
Thus, prove.
Theorem 4: By the definition of star and above theorem,
prove that (L*)* = L*.
prove that (L*)* = L*.
Proof:
By the definition of star, we have:
(L*)* =
È (L*)a
a = 0…¥
Þ (L*)* =
L* (from result)
a = 0…¥
Þ (L*)* = L* (from the definition of star)
Thus, prove.
Theorem 5:
Let A be any set of string. Prove
that A* = A+ if
and only if e Î
A.
Proof: For any set A, we know that A0 = {e} and An = {w / w= w1 w2
w3………wn , wi Î A for each i = 1, 2, ………., n}, for n ³ 1. Also, A1 = A.
· First, we will prove that: if e Î
A, then A* = A+.
Let e Î A.
Thus, A = A1=
A0 È A1
= {e} È A. Means, A1= A0 È A1.
Therefore,
A+ =
A 1 È A
2 È A3
È……..
=> A+ =
A 0 È A
1 È A
2 È A3
È…….. (because A1=
A 0 È A
1).
=> A+ =
A* (because
A* = A 0 È A
1 È A
2 È A3
È…….).
Therefore, we have shown that: if e Î A, then A* = A+.
· Next, we will prove that: if A* = A+, then
e Î A. This is equivalent to showing that: if e Ï
A, then A* = A+.
Let
e Ï A.
Thus,
we get, A1¹ A0 È A1.
Also,
if e Ï A, then e Ï An for any n ³ 1, because we can never concatenate
non-empty strings together to obtain an empty string.
Therefore,
A* = A0 È A1 È A2 È … cannot equal to A+ = A1 È A2 È A3 È …… since e Î A* but e Ï A+.
Therefore,
we have shown that: if A*= A+,
then e Î A.
Thus, prove.
Example 1: Consider the following two language over
the alphabet, ∑ = {a, b}:
A = {ab, a, e, abb}
B
= {e, bb, b}
(a) List all the strings in the language
A•B of length 3.
(b) List all the strings in the language
A* of length 3.
(c) List all the strings in the language
A-B.
Solution: (a) abb.
(b) {aaa, aba, aab, abb}.
(c) {ab, a, abb}.
Example 2: For each of the following 5 languages, give
two least strings that are in (Î)
the language and two least strings that are not in (Ï) the language.
(1)
aa + aab (2)
b*(ab)* a* (3)
(a*+b*) (a*+b*) (a*+b*)
(4) a*(baa*)* b*
(5) b*(a+ba)*b*
Solution:
(1)
|
(2)
|
(3)
|
(4)
|
(5)
|
|
ab + aab
|
b*(ab)* a*
|
(a*+b*)
(a*+b*)(a*+b*)
|
a* (baa*)*
b+
|
b* (a+ba)*
b*
|
|
Î
|
ab, aab
|
e, a
|
e, a
|
e, a
|
e, a
|
Ï
|
e, a
|
aab, abb
|
abab, baba
|
bba, abba
|
abba, aabba
|
Example 3:
Consider the following two languages on
the alphabet
å = {0, 1};
L1
= {0n:
n ³
1};
L2
= {01n:
n ³
0}.
Describe the languages below.
(a) L3 = L1* (b) L4 = L1+
(c)
L5 =
(d) L6 = L2*
(e) L7 = L1Ç
L2 (f) L8 = L1 L2
Solution: (a) L3
= L1* = {0n: n ³ 0} = the set of the empty string and all
strings that have no 1’s.
(b) L4
= L1+= {0n: n ³ 1} = the set of all non-empty strings that
have
no
1’s.
(c) L5 =
=
{e} È {w: w include at least one 1} = the set of
empty string and all strings that have at least one 1.
(d) L6
= L2* =
{e} È {0w: w is any string over, å } = the set of empty string and all
strings that begin with 1.
(e) L7 = L1 Ç L2 = {0}.
(f) L8 = L1 L2 = {0m1n: m ³ 2
and n ³ 0}.
---------------------------------------------------------------------------------------------------------------------
For further detail study:
Refer to book:
AUTOMATA THEORY: A STEP-BY-STEP APPROACH Published By S. CHAND PUBLICATION, New Delhi. Author: MANISH KUMAR JHA
===========================================
Subscribe to:
Posts (Atom)