Phrase structure grammars

To insert a sample PSG into your document, select Insert ⟶ Structure ⟶ PSG from the editor.

A phrase structure grammar (PSG) is a variation of a context-free grammar (CFG) that recognizes a language consisting of a set of sentences. A sentence is a finite sequence of words belonging to a lexicon. A PSG is composed of four components∶
1. A set of variables
2. A lexicon
3. A starting variable (S)
4. A set of grammar rules

The lexicon is a set of words (each word consisting of one or more lowercase letters). Each sentence derived by a PSG is composed of words from the lexicon. The PSG uses a lexicon instead of the input alphabet Σ used by the CFG.

A variable in a PSG derives a word or a phrase of the sentence. By convention, the following variable names are often used in PSGs to denote certain types of phrases and parts of speech∶

Variable name   Meaning       Example
S               Sentence      the dog chases the cat
NP              Noun phrase   the dog
VP              Verb phrase   chases the cat
D               Determiner    the
N               Noun          dog
V               Verb          chases

The variable names above are conventional and have no special meaning in Grafstate.

The variables and the lexicon in a PSG are determined by Grafstate automatically from the rules.
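
As an illustration of how variables and a lexicon can be read off a list of rules, here is a minimal Python sketch. It is not Grafstate's implementation. The function name infer_symbols is hypothetical, and the sketch assumes the rule syntax described in the sections below: every left-hand side is a variable, and a right-hand-side term is a word exactly when it is all lowercase.

```python
def infer_symbols(rule_lines):
    """Collect the variables and the lexicon from rule lines such as 'NP->D N'."""
    variables, lexicon = set(), set()
    for line in rule_lines:
        head, body = line.split("->", 1)
        variables.add(head.strip())
        # Alternatives joined by '|' are treated as separate expansions.
        for expansion in body.split("|"):
            for term in expansion.split():
                # Assumption: all-lowercase terms are words, the rest are variables.
                if term.islower():
                    lexicon.add(term)
                else:
                    variables.add(term)
    return variables, lexicon


rules = ["S->NP VP", "NP->D N", "VP->V|V NP", "D->the", "N->dog|cat", "V->chases"]
variables, lexicon = infer_symbols(rules)
print(sorted(variables))  # ['D', 'N', 'NP', 'S', 'V', 'VP']
print(sorted(lexicon))    # ['cat', 'chases', 'dog', 'the']
```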

Grammar rules

A PSG makes use of two kinds of grammar rules∶

Phrase rules

A phrase rule defines how to expand a grammar variable into a phrase (e.g., the cat or in the tree). A phrase rule has the following format∶
Variable ⟶ Expansion,

where Variable is one of the grammar variables and Expansion is a sequence of grammar variables describing a phrase.

The following line is a phrase rule that defines a simple noun phrase∶
NP -> D N

Here, D represents a determiner (e.g., the) and N represents a noun (e.g., dog).

Lexical rules

A lexical rule expands a grammar variable to a single word. The lexical rules are used to define the lexicon for the grammar. A lexical rule has the following format∶
Variable ⟶ word,

where Variable is one of the grammar variables and word is composed entirely of lowercase letters.

The following line is a lexical rule that adds the word chases to the lexicon∶
V -> chases

Here, V represents a verb.

Syntax constraints for a PSG

Each line of a PSG is checked against the following constraints. If a line violates a constraint, then you will receive an error message.
Each variable name must begin with an uppercase letter.
The initial variable is called S.
Each phrase rule has the form A->VAR1 VAR2 ..., where A is a variable and VAR1, VAR2, … are variables.
Each lexical rule has the form A->word1 | word2 | ..., where A is a variable and word1, word2, … are words.
Each word consists entirely of lowercase letters.
Terms on the right-hand side of a rule must be separated by spaces.
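
The constraints above can be approximated by a small line checker. The Python sketch below is only an illustration, not Grafstate's actual validator. The name check_line and the error strings are hypothetical. It also assumes that a variable name may contain letters and digits after its uppercase initial, and that each |-separated expansion is checked on its own; neither point is stated by the constraints.

```python
import re

VARIABLE = r"[A-Z][A-Za-z0-9]*"  # assumption: letters or digits after the uppercase initial
WORD = r"[a-z]+"                 # a word consists entirely of lowercase letters


def check_line(line):
    """Return None if the line passes the checks, otherwise a short error string."""
    if "->" not in line:
        return "missing '->'"
    head, body = (part.strip() for part in line.split("->", 1))
    if not re.fullmatch(VARIABLE, head):
        return f"'{head}' is not a valid variable name"
    for expansion in body.split("|"):
        terms = expansion.split()  # terms must be separated by spaces
        if not terms:
            return "empty expansion"
        all_variables = all(re.fullmatch(VARIABLE, t) for t in terms)
        single_word = len(terms) == 1 and re.fullmatch(WORD, terms[0])
        if not (all_variables or single_word):
            return f"'{expansion.strip()}' is neither a sequence of variables nor a single word"
    return None


for line in ["NP->D N", "N->dog|cat", "np->D N", "N->dog cat"]:
    print(line, "=>", check_line(line) or "ok")
```

The first two lines pass; the third fails because its left-hand side starts with a lowercase letter, and the fourth fails because a lexical rule expands to exactly one word.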

Shorthand notation

You can write multiple rule expansions from a single variable on one line, using the | symbol to join the expansions (the right-hand sides) of the rules.

The line
NP -> D N | D A N
defines the following 2 rules∶
1. NP -> D N
2. NP -> D A N

Let D , A , and N represent a determiner, an adjective, and a noun (respectively). Then rule 1 above defines a noun phrase ( NP ) to consist of a determiner followed by a noun, and rule 2 defines a noun phrase to consist of a determiner followed by an adjective followed by a noun.

The line
N -> dog | cat
defines the following 2 rules∶
N -> dog
N -> cat

The 2 rules above add the words dog and cat to the lexicon.

The | symbol separates whole expansions, not individual terms. The line NP -> D N | D A N is not read as "NP expands to D, followed by either N or D, followed by A, followed by N." The correct interpretation is∶
NP can expand to either D N or to D A N.
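
The shorthand can be sketched in a few lines of Python. This is an illustration only; the helper name expand_shorthand is hypothetical and not part of Grafstate. It splits the right-hand side at each | so that every alternative becomes its own rule with the same left-hand variable.

```python
def expand_shorthand(line):
    """Split a rule line at '|' into one rule per alternative."""
    head, body = line.split("->", 1)
    return [f"{head.strip()} -> {' '.join(alt.split())}" for alt in body.split("|")]


print(expand_shorthand("NP -> D N | D A N"))  # ['NP -> D N', 'NP -> D A N']
print(expand_shorthand("N -> dog | cat"))     # ['N -> dog', 'N -> cat']
```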

Example

The following is a description of a PSG that accepts some simple English sentences.
Phrase structure grammar
Variables:   V = {D,N,S,V,NP,VP}.
Lexicon:   λ = {cat,dog,the,chases}.
Initial variable:   V0 = S.
Rules:
     S → NP VP
     NP → D N
     VP → V
     VP → V NP
     D → the
     N → dog
     N → cat
     V → chases

The PSG described above can be constructed using the following Grafstate code∶
Code
:+ psg G
S->NP VP
NP->D N
VP->V|V NP
D->the
N->dog|cat
V->chases
done.

Simulating a PSG

The Grafstate Simulator shows all expansion steps that a PSG takes when deriving an input string. If an input string w is derived by a PSG, then the expansion steps are visualized in the simulation as a parse tree. Symbols from w that are read are shown in the parse tree as nodes with boxes. Variables in the expansion steps are shown in the parse tree as nodes without boxes.

The following Grafstate code simulates G on an input sentence∶
Code
:sim G the dog chases the cat

The parse tree shows how the sentence was derived∶
Simulation
Input the dog chases the cat
Result ACCEPTED
Explanation Below is the parse tree∶
[Parse tree figure] Phrase structure grammar∶ G

If a sentence is not derived by a PSG, then the simulation will not show a parse tree. Instead, the Grafstate Simulator attempts to explain why the sentence could not be derived.

The following sentence is not derived by G∶
Code
:sim G the dog the chases the cat
Simulation
Input the dog the chases the cat
Result REJECTED
Explanation The string "the dog the chases the cat" cannot be read. Input can be read up to the highlighted "the" but no rules are available to read "the". The available rules call for reading "chases" instead.
Phrase structure grammar∶ G
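
The accepted and rejected outcomes above can be reproduced outside Grafstate with a small backtracking recognizer. The Python sketch below is an assumption-laden illustration, not the algorithm used by the Grafstate Simulator. The names GRAMMAR, parse, parse_sequence, derive, and show are hypothetical, and the approach only works for grammars without left recursion, such as G.

```python
# The rules of G, with each '|' alternative listed as its own expansion.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "D": [["the"]],
    "N": [["dog"], ["cat"]],
    "V": [["chases"]],
}


def parse_sequence(symbols, words, pos):
    """Yield (children, next_pos) for each way the symbols match words from pos."""
    if not symbols:
        yield [], pos
        return
    for tree, mid in parse(symbols[0], words, pos):
        for rest, end in parse_sequence(symbols[1:], words, mid):
            yield [tree] + rest, end


def parse(symbol, words, pos):
    """Yield (tree, next_pos) for each way symbol derives words[pos:next_pos]."""
    if symbol not in GRAMMAR:  # the symbol is a word from the lexicon
        if pos < len(words) and words[pos] == symbol:
            yield symbol, pos + 1
        return
    for expansion in GRAMMAR[symbol]:
        for children, end in parse_sequence(expansion, words, pos):
            yield (symbol, children), end


def derive(sentence):
    """Return a parse tree if S derives the whole sentence, otherwise None."""
    words = sentence.split()
    for tree, end in parse("S", words, 0):
        if end == len(words):
            return tree
    return None


def show(tree, depth=0):
    """Print a tree with one node per line, indented by depth."""
    if isinstance(tree, str):
        print("  " * depth + tree)
        return
    print("  " * depth + tree[0])
    for child in tree[1]:
        show(child, depth + 1)


show(derive("the dog chases the cat"))
print(derive("the dog the chases the cat"))  # None: the sentence is rejected
```

A tree is a (variable, children) pair and a leaf is a word, which mirrors the distinction the simulator draws between unboxed variable nodes and boxed word nodes.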