Natural Language Processing

## Rule-Based Matching in spaCy

Rule-based matching is one of spaCy's most useful features. It lets you extract information from a document by describing the tokens you are looking for, using a single pattern or a combination of patterns.
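To make "a pattern or a combination of patterns" concrete, here is a minimal sketch. It assumes spaCy v3 and uses a blank English pipeline (tokenizer only), so no trained model is required. The rule name `COUNTRY` and the sample sentence are made up for illustration.

```python
import spacy
from spacy.matcher import Matcher

# A blank English pipeline provides only the tokenizer,
# which is enough for patterns based on token text.
nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)

# Each dict describes one token; a list of dicts describes a token sequence.
single_token = [{"LOWER": "america"}]
two_tokens = [{"LOWER": "united"}, {"LOWER": "states"}]

# Several patterns can be registered under one rule name (spaCy v3 signature).
matcher.add("COUNTRY", [single_token, two_tokens])

# Hypothetical sample sentence, not taken from the speech.
doc = nlp("The United States is often just called America.")

# Each match is a (match_id, start, end) tuple of token offsets.
found = [doc[start:end].text for _, start, end in matcher(doc)]
print(found)
```

The `LOWER` attribute compares against the lowercased token, so the patterns match regardless of capitalization.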

As an illustration, I will use an Obama speech from http://obamaspeeches.com/ and count how many times Obama said “America” in it. spaCy's rule-based `Matcher` can parse the text and find every occurrence as follows:

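import spacy
from spacy.matcher import Matcher

# Assumed pipeline; any English pipeline works, since the pattern only needs token text
nlp = spacy.load("en_core_web_sm")

matcher = Matcher(nlp.vocab)
pattern = [{"TEXT": "America"}]
# Register the pattern (spaCy v3 signature); without this step the matcher finds nothing
matcher.add("AMERICA", [pattern])

# text is assumed to hold the speech transcript as a string
doc = nlp(text)
matches = matcher(doc)
# Each match is a (match_id, start, end) tuple, so the count is the list length
count = len(matches)
print("No of times Obama used America is", count)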

Output:
No of times Obama used America is 10
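To see what the matcher actually returns, the following self-contained sketch runs the same pattern on a short stand-in text. The sample sentence is hypothetical and the blank pipeline is an assumption made so the snippet runs without downloading a model; it is not the setup used for the count above.

```python
import spacy
from spacy.matcher import Matcher

nlp = spacy.blank("en")  # tokenizer only; no model download needed
matcher = Matcher(nlp.vocab)
matcher.add("AMERICA", [[{"TEXT": "America"}]])

# Hypothetical sample text standing in for the speech.
sample = "America is strong. American workers built America's roads."
doc = nlp(sample)

matches = matcher(doc)
for match_id, start, end in matches:
    # match_id is a hash; look up the rule name in the vocab's string store.
    rule = nlp.vocab.strings[match_id]
    print(rule, start, end, doc[start:end].text)

count = len(matches)
print("Count:", count)
```

Because `TEXT` requires an exact token match, "American" is skipped, while "America's" still counts: the tokenizer splits it into "America" and "'s". If lowercase or uppercase variants should count too, `{"LOWER": "america"}` is the usual alternative.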

May 23, 2021