30 Days Of JavaScript.12
Day 12. Regular Expressions
On Day 12, I learned about regular expressions. I first met the concept on Day 2, but in this lesson I could learn more deeply about how to use them. Let’s read the documentation and examples to get used to the various expressions!
Regular Expressions
A regular expression, or RegExp, is a small programming language that helps to find patterns in data. A RegExp can be used to check whether some pattern exists in different types of data.
- How to use RegExp in JavaScript
1) Use the RegExp constructor
2) Declare a RegExp literal: a pattern between two forward slashes(/), optionally followed by a flag
- To declare a string
- single quote(‘ ‘)
- double quote(“ “)
- backtick(``)
- To declare a regular expression
- two forward slashes(/) and an optional flag
- the flag could be: g, i, m, s, u, y
1. RegExp parameters
A regular expression takes two parameters: a search pattern(required) and a flag(optional).
Pattern
A pattern could be a piece of text or any form of pattern that has some regularity. For instance, the word ‘spam’ in an email could be a pattern. A pattern is whatever I’m interested in looking for: a word, a number, etc.
Flags
Flags are optional parameters in a regular expression which determine the type of searching.
- g: global flag. It looks for a pattern in the whole text.
- i: ignore case. It searches case-insensitively, matching both lowercase and uppercase.
- m: multiline. It makes ^ and $ match at the start and end of each line, not only of the whole string.
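To see what each flag changes, here is a small sketch. The two-line sample string `text` is made up for this illustration, and the results in the comments follow from it.

```javascript
// Sample text (made up for this illustration) with two lines
const text = 'Love is kind.\nlove is patient.';

// No flag: only the first case-sensitive match is found
const noFlag = text.match(/love/);
console.log(noFlag[0], noFlag.index); // love 14

// i flag: case-insensitive, so the first match is 'Love' at index 0
const iFlag = text.match(/love/i);
console.log(iFlag[0], iFlag.index); // Love 0

// g + i flags: every match in the whole text
const giFlags = text.match(/love/gi);
console.log(giFlags); // ['Love', 'love']

// m flag: ^ matches at the start of each line, not only the start of the string
const withoutM = text.match(/^love/gi);
const withM = text.match(/^love/gim);
console.log(withoutM); // ['Love']
console.log(withM); // ['Love', 'love']
```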
2. Creating a pattern with RegExp Constructor
/pattern/flags
new RegExp(pattern[, flags])
RegExp(pattern[, flags])
1) Declaring regular expression without flags
let pattern = 'love';
let regEx = new RegExp(pattern);
2) Declaring regular expression with g flag and i flag
let pattern = 'love';
let flag = 'gi'
let regEx = new RegExp(pattern, flag);
3) Declaring regular expression with flags + writing the flag inside the RegExp constructor
let regEx = new RegExp('love', 'gi');
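The constructor form is handy when the pattern is only known at run time, for example when it is stored in a variable. Here is a minimal sketch: `countWord` is a hypothetical helper name, and the word is assumed to contain only plain letters (no special characters that would need escaping).

```javascript
// Hypothetical helper: counts how many times `word` appears in `text`.
// Assumes `word` contains only plain letters, so no escaping is needed.
function countWord(text, word) {
  const regEx = new RegExp(word, 'gi'); // pattern built from a variable
  const matches = text.match(regEx); // null when nothing matches
  return matches === null ? 0 : matches.length;
}

console.log(countWord('I love JavaScript. Love is all you need.', 'love')); // 2
console.log(countWord('I love JavaScript.', 'hate')); // 0
```

A literal like /love/gi cannot take its pattern from a variable, which is the main reason to reach for the constructor.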
3. Creating a pattern without RegExp Constructor
To declare a regular expression, use two forward slashes(/) and an optional flag.
// without using RegExp constructor
let regEx = /love/gi;
// using RegExp constructor
let regEx = new RegExp('love', 'gi');
4. RegExp Object Methods
1) testing for a match : test()
regexp.test(string)
test() is used when I want to know if the pattern exists in a string. It returns true or false.
const str = 'I love JavaScript';
const pattern = /love/; // must be a RegExp, because strings have no test() method
let result = pattern.test(str);
console.log(result); // true
A string method search() returns the index of a match (or -1 if the pattern is not found).
A RegExp method test() returns a boolean.
2) finding matches: match(), search()
① match()
string.match(regexp)
Return value:
- if the match is not found: null
- with g flag: all results matching the complete regular expression will be returned. Capturing groups will not be returned.
- without g flag: only the first complete match and its related capturing groups are returned. The returned item will have additional properties: index, input, groups.
- groups: an object of named capturing groups, whose keys are the names and values are the capturing groups, or undefined if no named capturing groups were defined.
- index: the index where the result was found. If I search for a word, match() without g flag will return the index of the first occurrence of the word.
- input: a copy of the search string
// if the match is not found
const str = 'I love JavaScript';
const pattern = /art/;
const result = str.match(pattern);
console.log(result); // null
// with 'g' flag
const str = 'I love JavaScript';
const pattern = /[A-Z]/g;
const result = str.match(pattern);
console.log(result);
// (3) ['I', 'J', 'S']
// without 'g' flag
const str = 'I love JavaScript';
const pattern = /love/;
const result = str.match(pattern);
console.log(result);
// ['love', index: 2, input: 'I love JavaScript', groups: undefined]
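The groups property only becomes useful with named capturing groups, written as (?<name>...). Here is a small sketch of that. The date string and the group names month, day, and year are chosen just for illustration.

```javascript
const dateStr = 'Today is May 11, 2022.';
// Named groups: month (capitalized word), day (1-2 digits), year (4 digits)
const datePattern = /(?<month>[A-Z][a-z]+) (?<day>\d{1,2}), (?<year>\d{4})/;
const dateMatch = dateStr.match(datePattern);

console.log(dateMatch[0]); // May 11, 2022
console.log(dateMatch.index); // 9
console.log(dateMatch.groups.month); // May
console.log(dateMatch.groups.day); // 11
console.log(dateMatch.groups.year); // 2022
```

Adding the g flag to this pattern would return only the full matches, without index or groups.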
② search()
search() is used when I want to know whether a pattern is found, and also know its index within a string.
string.search(regexp)
Return value:
- if found: the index of the first match between the regular expression and the given string.
- if not found: -1
const str = 'I love JavaScript';
const pattern = /love/g;
const result = str.search(pattern);
console.log(result); // 2
3) replacing a substring: replace()
string.replace(regexp, newSubstr)
string.replace(regexp, replacerFunction)
string.replace(substr, newSubstr)
string.replace(substr, replacerFunction)
replace() executes a search for a match in a string, and replaces the matched substring with a replacement substring.
const txt = 'If it rains, I will take my umbrella. The Umbrella Academy is so fun.';
const newTxt = txt.replace(/Umbrella|umbrella/, 'Sparrow');
console.log(newTxt);
// If it rains, I will take my Sparrow. The Umbrella Academy is so fun.
I can use the OR operator(|) to match both the lowercase and uppercase spellings. Using the i flag will do the same job more easily. Note that only the first match was replaced, because the pattern has no g flag. One more thing: replace() does not change the original string but returns a new one, so I should store the result in another variable.
const txt = 'If it rains, I will take my umbrella. The Umbrella Academy is so fun.';
const newTxt = txt.replace(/Umbrella/gi, 'Sparrow');
console.log(newTxt);
// If it rains, I will take my Sparrow. The Sparrow Academy is so fun.
Since I used both the g flag and the i flag, the replace() method will find the pattern in the whole text(g) case-insensitively(i).
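The syntax list for replace() also allows a function as the second argument. Below is a rough sketch of that form: the function receives the matched substring, and whatever it returns is used as the replacement. The variable names and the choice to uppercase the match are only for illustration.

```javascript
const sentence = 'If it rains, I will take my umbrella. The Umbrella Academy is so fun.';

// The replacer function is called once per match (every match, because of the g flag)
const shouted = sentence.replace(/umbrella/gi, (match) => match.toUpperCase());

console.log(shouted);
// If it rains, I will take my UMBRELLA. The UMBRELLA Academy is so fun.
console.log(sentence === shouted); // false, the original string is unchanged
```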
4) How to set a regular expression
- Link to read: Regular expressions
① [ ] : a set of characters
[a-c]: a or b or c
[a-z]: any letter between a and z(lowercase)
[A-Z]: any letter between A and Z(uppercase)
[0-3]: 0 or 1 or 2 or 3
[0-9]: any number between 0 and 9 (same as `\d`)
[A-Za-z0-9]: any character between A and Z, a and z, 0 and 9
- Using square bracket([]) with/without g flag
// without 'g' flag
const pattern = /[Aa]pple/;
const txt = 'Apple is delicious. I like an apple juice.'
const matches = txt.match(pattern);
console.log(matches);
// ['Apple', index: 0, input: 'Apple is delicious. I like an apple juice.', groups: undefined]
// with 'g' flag
const pattern = /[Aa]pple/g;
const txt = 'Apple is delicious. I like an apple juice.'
const matches = txt.match(pattern);
console.log(matches); // (2) ['Apple', 'apple']
In the code above, [Aa] means ‘either A or a’.
- Using square bracket([]) and OR operator(|)
const pattern = /[Aa]pple|[Oo]range/g
const txt = 'Apple is delicious. I like apple juice and orange juice. Orange has a nice flavor.'
const matches = txt.match(pattern);
console.log(matches);
// (4) ['Apple', 'apple', 'orange', 'Orange']
② \ : escape character
- \d : one digit. It matches any single digit(0-9) in the string.
- \D : one non-digit. It matches any single character that is not a digit, including space(‘ ‘).
- Using escape character(\) with/without +
// without +
const pattern = /\d/g;
const txt = 'Today is May 11, 2022.';
const matches = txt.match(pattern);
console.log(matches);
// (6) ['1', '1', '2', '0', '2', '2']
// with +(one or more times)
const pattern = /\d+/g;
const txt = 'Today is May 11, 2022.';
const matches = txt.match(pattern);
console.log(matches); // (2) ['11', '2022']
// \D
const pattern = /\D/g;
const txt = 'Today is May 11, 2022.';
const matches = txt.match(pattern);
console.log(matches);
// (16) ['T', 'o', 'd', 'a', 'y', ' ', 'i', 's', ' ', 'M', 'a', 'y', ' ', ',', ' ', '.']
// \D+
const pattern = /\D+/g;
const txt = 'Today is May 11, 2022.';
const matches = txt.match(pattern);
console.log(matches);
// (3) ['Today is May ', ', ', '.']
‘\d’ is a special sequence that means ‘digit’: the backslash gives the ordinary letter d a special meaning.
③ . : any character except the newline character(\n)
[a]. finds a pattern that starts with ‘a’, followed by any one character except newline(\n).
// search 'a' + any one character
const pattern = /[a]./g;
const txt = 'Apple and banana are fruits.';
const matches = txt.match(pattern);
console.log(matches);
// (5) ['an', 'an', 'an', 'a ', 'ar']
// search 'a' + any two characters
const pattern = /[a]../g;
const txt = 'Apple and banana are fruits.'
const matches = txt.match(pattern);
console.log(matches);
// (3) ['and', 'ana', 'a a']
// search 'a' + one or more characters
const pattern = /[a].+/g;
const txt = 'Apple and banana are fruits.'
const matches = txt.match(pattern);
console.log(matches);
// ['and banana are fruits.']
④ ^ (Caret) : start of string / negation
- Using ^ as ‘starts with’
^word : a string which starts with word
const str = 'Learning JavaScript is fun.';
// test if the string starts with 'l'
console.log(/^l/i.test(str)); // true
// without 'g' flag
const txt = 'This is Ramona. This lesson is interesting.'
const pattern = /^This/;
const matches = txt.match(pattern);
console.log(matches);
// ['This', index: 0, input: 'This is Ramona. This lesson is interesting.', groups: undefined]
// with 'g' flag (still one match, because ^ only matches at the start of the string)
const txt = 'This is Ramona. This lesson is interesting.';
const pattern = /^This/g;
const matches = txt.match(pattern);
console.log(matches); // ['This']
- Using ^ as negation(in a character set)
[^abc] : not a, not b, not c
const txt = 'This is Ramona. It is 11 in the morning.';
// not A-Z, not a-z, not ., not ,, not space( )
const pattern = /[^A-Za-z., ]+/g;
const matches = txt.match(pattern);
console.log(matches); // ['11']
// without '+', ['1', '1'] is returned
⑤ $ : end of string
word$ : a string which ends with word
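As a quick sketch of $ on its own, the check below tests how a file name ends. The file name is invented for the example.

```javascript
const fileName = 'report.final.pdf';

// \. escapes the dot so it matches a literal '.', and $ anchors the match to the end
console.log(/\.pdf$/.test(fileName)); // true
console.log(/\.final$/.test(fileName)); // false, '.final' is not at the end

// match() shows where the ending was found
const ending = fileName.match(/pdf$/);
console.log(ending.index); // 13
```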
④+⑤ ^...$ : exact match
^ and $ can be used together like ^...$ to find a full match.
/* a pattern that starts with one uppercase letter(A-Z),
followed by 3 to 12 lowercase letters(a-z) */
let pattern = /^[A-Z][a-z]{3,12}$/;
let name = 'Ramona';
let result = pattern.test(name);
console.log(result); // true
// some experiments
console.log(pattern.test('ramOnd')); // false
console.log(pattern.test('RamONd')); // false
⑥ * : zero or more times (same as {0,})
x* : the preceding item x may not exist or may occur many times
/* find a pattern that starts with 'a'
followed by zero or more characters except newline(\n) */
const pattern = /[a].*/g
const txt = 'Apple and banana are fruits.'
const matches = txt.match(pattern);
console.log(matches);
// ['and banana are fruits.']
⑦ + : one or more times (same as {1,})
x+ : the preceding item x must occur at least once and may occur many times
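The difference between + and * shows up when the repeated character is missing entirely. This is a small sketch with a made-up sample string:

```javascript
const sample = 'gd god good goood';

// o+ requires at least one 'o' between 'g' and 'd'
const plusMatches = sample.match(/go+d/g);
console.log(plusMatches); // ['god', 'good', 'goood']

// o* also accepts zero 'o', so 'gd' matches too
const starMatches = sample.match(/go*d/g);
console.log(starMatches); // ['gd', 'god', 'good', 'goood']
```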
⑧ ? : zero or one time (same as {0,1})
x? : the preceding item x may not exist or may occur only once, which makes x optional
th?at finds a pattern that starts with ‘t’, followed by an optional ‘h’(none or one) and ends with ‘at’.
const txt = 'hat that tat threat thhat';
// search 't' + optional 'h' + 'at'
const pattern = /th?at/g;
const matches = txt.match(pattern);
console.log(matches);
// (2) ['that', 'tat']
// notice that 'thhat' isn't included
const txt = 'Do you write color or colour?'
// search 'colo' + optional 'u' + 'r'
const pattern = /colou?r/g;
const matches = txt.match(pattern);
console.log(matches); // (2) ['color', 'colour']
In the code above, the pattern looks for ‘colo’ followed by zero or one ‘u’, and then ‘r’. It returns both ‘color’ and ‘colour’. ? cannot be placed at the front of a pattern, because it needs a preceding item to apply to.
⑨ {} : the quantifier
Using curly braces, I can specify how many times the preceding item should repeat in the text I look for.
- {3} : exactly 3 times
- {3,} : at least 3 times
- {3,8} : 3 to 8 times
// exact length
const txt = 'Today is May 11 of the year 2022.';
const pattern = /\d{4}/g;
const matches = txt.match(pattern);
console.log(matches); // ['2022']
// range of length
const txt = 'Today is May 11 of the year 2022. I am learning coding for 48 days.';
const pattern = /\d{1,4}/g;
const matches = txt.match(pattern);
console.log(matches); // (3) ['11', '2022', '48']
⑩ | : either / or
a|b : either a or b
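Here is a short sketch of | with whole words and with a single letter. The sample strings are made up, and the parentheses in the second pattern are only there to limit how far the alternation reaches.

```javascript
const note = 'I have a cat, a dog, and a catfish.';

// Either 'cat' or 'dog', anywhere in the text
const pets = note.match(/cat|dog/g);
console.log(pets); // ['cat', 'dog', 'cat']

// gr(a|e)y means 'gr', then 'a' or 'e', then 'y'
const spellings = 'gray grey griy'.match(/gr(a|e)y/g);
console.log(spellings); // ['gray', 'grey']
```

Without the parentheses, /gra|ey/ would mean ‘gra’ or ‘ey’, because | splits the whole pattern into alternatives.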