python - NLTK: filter sentences with specific structures


How do I filter sentences with specific structures using NLTK? For example, I have the following definition of a context-free grammar:

  1. S → NP VP
  2. S → Aux NP VP
  3. S → VP
  4. NP → Pronoun
  5. NP → Proper-Noun
  6. NP → Det Nominal
  7. Nominal → Noun
  8. Nominal → Nominal Noun
  9. Nominal → Nominal PP
  10. VP → Verb
  11. VP → Verb NP
  12. VP → VP PP
  13. PP → Prep NP

As can be seen, three types of sentence structure are defined:

  1. S → NP VP
  2. S → Aux NP VP
  3. S → VP

Given the following sentence, I want to know whether it conforms to any of the above three sentence structures.

I not country music has potential beauty, combined inclusions of comedy, sadness.

My question is: how should I do this using NLTK?

http://www.nltk.org/book/ch05.html

That chapter should explain what you need for this. You first have to tokenize the sentence (break it into individual tokens); the tokens are then tagged with the corresponding POS (part-of-speech) tag that NLTK identifies them as.

This returns a list of (word, tag) tuples, and there are a number of ways to compare those tuples with the ones in your grammar.

Here is the specific code, to guard against faulty future links:

    >>> import nltk
    >>> from nltk import word_tokenize
    >>> text = word_tokenize("And now for something completely different")
    >>> nltk.pos_tag(text)
    [('And', 'CC'), ('now', 'RB'), ('for', 'IN'), ('something', 'NN'), ('completely', 'RB'), ('different', 'JJ')]
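One possible way to compare the tagged tuples with the grammar is to treat the POS tags themselves as the terminals of the grammar and hand the tag sequence to a chart parser. The following is only a rough sketch under some assumptions: the mapping from Penn Treebank tags to the question's categories (for example `MD` for Aux, `IN` for Prep) is my own guess at what the grammar intends, `ProperNoun` stands in for Proper-Noun, and `matching_structures` is a made-up helper name. It takes already-tagged input, so it does not need the tagger models.

```python
import nltk

# The grammar from the question, with POS tags as terminals.
# ASSUMPTION: this tag-to-category mapping is illustrative, not definitive.
GRAMMAR = nltk.CFG.fromstring("""
    S -> NP VP | Aux NP VP | VP
    NP -> Pronoun | ProperNoun | Det Nominal
    Nominal -> Noun | Nominal Noun | Nominal PP
    VP -> Verb | Verb NP | VP PP
    PP -> Prep NP
    Pronoun -> 'PRP'
    ProperNoun -> 'NNP' | 'NNPS'
    Det -> 'DT'
    Noun -> 'NN' | 'NNS'
    Verb -> 'VB' | 'VBD' | 'VBG' | 'VBN' | 'VBP' | 'VBZ'
    Aux -> 'MD'
    Prep -> 'IN'
""")
PARSER = nltk.ChartParser(GRAMMAR)


def matching_structures(tagged):
    """Return the top-level S rules that the tagged sentence can be parsed with.

    `tagged` is a list of (word, tag) tuples, as returned by nltk.pos_tag.
    An empty set means the sentence fits none of the three structures.
    """
    tags = [tag for _word, tag in tagged]
    if not tags:
        return set()
    try:
        trees = list(PARSER.parse(tags))
    except ValueError:
        # Raised when a tag (e.g. 'CC' or 'RB') is not covered by the grammar.
        return set()
    # The children of each root node spell out which S rule was used.
    return {"S -> " + " ".join(str(child.label()) for child in tree)
            for tree in trees}
```

For example, `matching_structures(nltk.pos_tag(word_tokenize(sentence)))` would give a non-empty set only for sentences that fit one of the three structures, so it can be used as a filter condition. Any tag outside the mapping (conjunctions, adverbs, adjectives, punctuation) makes the sentence fail, so a real grammar would probably need more rules.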
