Regular Expression – Commonly Used Functions

In this lesson, we will learn five of the most commonly used regular expression functions: re.search(), re.match(), re.sub(), re.finditer() and re.findall(). We will also discuss word boundaries in regular expressions.

So far, we have explored only one function in the ‘re’ module: re.search(). It is not the only function used with regular expressions, so we will now look at some other commonly used ones.

1. Match Function – re.match()

The match function, re.match(), returns a match only if the pattern matches at the very beginning of the string; otherwise, it returns None. The search function that we have studied so far behaves differently: it scans the string from left to right and returns the first place where the pattern matches, wherever that is in the string.

Let’s understand this function by working through one problem with re.match().
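The difference between the two functions can be seen on a single string. The snippet below is a minimal sketch; the sample text and variable names are made up for illustration and do not come from the lesson.

```python
import re

# Hypothetical sample text: the digits appear in the middle, not at the start.
text = "Order 66 was placed in 2019"
pattern = r"\d+"  # one or more digits

# re.match() only tries to match at position 0, so it fails here.
match_result = re.match(pattern, text)

# re.search() scans from the left and stops at the first match.
search_result = re.search(pattern, text)

print(match_result)            # None
print(search_result.group())   # 66

# When the string starts with digits, re.match() succeeds as well.
print(re.match(pattern, "66 orders").group())  # 66
```

Because re.match() returns None when there is no match, it is safer to check the result before calling group() on it.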

Exercise 1: re.match()

Description

Write a string such that when you run the re.match() function on the string using the given regex pattern ‘\d+’, the function returns a non-empty match.

[Image: Regular_Expression_Basics1]

As the above code shows, with a pattern that matches numeric digits, only the strings that begin with digits are matched, because re.match() returns a match only when the pattern is found at the very beginning of the string. That is why 1000_lakhs and 60 return true, whereas the others return false.
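The same check can be sketched in a few lines. The strings 1000_lakhs and 60 are the ones mentioned above; the other two candidates and the helper name starts_with_digits are assumptions added for illustration.

```python
import re

pattern = r"\d+"  # the pattern given in the exercise

# The first two strings are from the lesson; the last two are hypothetical.
candidates = ["1000_lakhs", "60", "lakhs_1000", "total: 60"]

def starts_with_digits(s):
    """Return True if re.match() finds the pattern at the start of s."""
    return re.match(pattern, s) is not None

results = {s: starts_with_digits(s) for s in candidates}
for s, matched in results.items():
    print(f"{s!r}: {matched}")
```

Only the strings that begin with a digit give True; a string such as lakhs_1000 contains digits, but not at the start, so re.match() returns None for it.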

The next function that we are going to learn is the substitute function.

2. Substitute Function – re.sub()

The substitute function, re.sub(), replaces a substring with another substring of our choice.

For example, we may want to replace the American spelling ‘neighbor’ with the British spelling ‘neighbour’.

Further, we generally use the re.sub() function for text cleaning tasks. For instance, it can replace every special character in a given string with a common string, such as SP_CHAR, that stands for all the special characters in the text.

More generally, re.sub() replaces every part of a string that matches a regex pattern: the regex engine finds each matching substring, and re.sub() replaces it with the string we supply.

For example, suppose we want to replace all the digits in a communication address with a certain string, say ‘XXX’. We can do this using the code below.
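Both uses mentioned so far can be sketched briefly. The sample sentences are invented, and the character class used for "special characters" (anything that is not a letter, digit or whitespace) is only one possible definition, chosen here as an assumption.

```python
import re

# 1) Spelling replacement: American 'neighbor' -> British 'neighbour'.
sentence = "My neighbor helps every other neighbor on the street."
british = re.sub(r"neighbor", "neighbour", sentence)
print(british)

# 2) Text cleaning: replace each special character with SP_CHAR.
# Assumption: "special" means not a letter, digit or whitespace.
raw_text = "Hello! Is this @home?"
cleaned = re.sub(r"[^A-Za-z0-9\s]", "SP_CHAR", raw_text)
print(cleaned)
```

Note that re.sub() replaces every match by default, so both occurrences of ‘neighbor’ are changed in one call.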

import re

# pattern that matches a single numeric digit
pattern = r"\d"
# string to put in place of each match
to_replace = "XXX"
# input string in which we want to substitute
input_str = "My address is 35, Napier Road Colony Part 1, Uttar Pradesh - 226003"

# replace every digit in input_str with to_replace
print(re.sub(pattern, to_replace, input_str))
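One detail of the snippet above is worth checking: the pattern \d matches one digit at a time, so each individual digit becomes ‘XXX’. If the intent is to replace each whole number with a single ‘XXX’, the pattern \d+ is one way to do it. The comparison below is a sketch; the variable names per_digit and per_number are made up for illustration.

```python
import re

input_str = "My address is 35, Napier Road Colony Part 1, Uttar Pradesh - 226003"

# \d matches a single digit, so "35" turns into "XXXXXX".
per_digit = re.sub(r"\d", "XXX", input_str)

# \d+ matches a whole run of digits, so "35" turns into "XXX".
per_number = re.sub(r"\d+", "XXX", input_str)

print(per_digit)
print(per_number)
# per_number: My address is XXX, Napier Road Colony Part XXX, Uttar Pradesh - XXX
```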

Exercise 2: re.sub()

Description

You are given the following string: 

“You can reach us at 08584986756 or 03361562153”

Substitute all the 11-digit phone numbers present in the above string with “#”. 
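One possible approach, shown here as a sketch rather than the reference solution, is to use the quantifier {11} so that the pattern matches exactly eleven consecutive digits. The variable names are assumptions.

```python
import re

text = "You can reach us at 08584986756 or 03361562153"

# \d{11} matches a run of exactly eleven digits.
masked = re.sub(r"\d{11}", "#", text)

print(masked)  # You can reach us at # or #
```

In a longer run of digits, \d{11} would also match the first eleven of them, so this pattern is only adequate when the numbers in the text are known to be exactly 11 digits long, as they are here.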

[Image: Regular_Expression_Basics1]