Concepts
- formal language: set of strings
- string: a sequence of symbols from an alphabet
- alphabet: a finite set of symbols
- finite automata: can be used to recognize certain languages
- deterministic finite automata: DFAs, have exactly one transition out of each state for each symbol in the alphabet
- two DFAs equivalent if they recognize the same laguage
- in a NFA a string is accepted if there exits at least one path from the start state to accept state
- DFA and NFA's graph deepth is the number of symbols in the processing string
- regular language is a language that can recognize by a DFA or NFA or a regular expression
NFA to DFA conversion
- NFA to DFA conversion: thinking of a combination of the states in the NFA as a single state in DFA that you are constructing
- the subset construction: the state in DFA we are constructing is a subset of states in NFA
- the number of subsets is 2 power N(the number of states count)
- epsilon closure(q0): the epsilon closure of q0 is q0 itself plus all the states that can be reached from q0 via epsilon transitions
- the start state in the target DFA is the epsilon-closure of the start state in the original NFA
- a state(set of NFA states) in the DFA ia an accept state if at least one of the states that constitute the state corresponds to an accept state in the original NFA
- conversion make trap state explicit
Regular operations and Regular expression
- concatenation: every state belongs to that language has to have two pieces, a prefix and suffix.the prefix belongs to language one, the suffix belongs to language two.
- regular operations: union, concatenation, star
- precedence: star (exponential) concatenation (mult) union (add)
-
Union
-
Concatenation
-
Star
-
convert regular expression to NFA
- compiler: regular expression(define the language sytax) -> NFA (subset construction) -> DFA
Grammars
-
a context-free grammar is quadruple (V, T, P, S)
-
a linear grammar is a grammar in which there is at most one variable on the right-hand side of any production (grammar rule)
- a regular grammar is a grammar that all productions are left-linear or right-linear
- a language is regular if there exits at least one regular grammar that generates it