Following are few bulit-in regular expression functions in ORACLE 11g Version.
REGEXP_COUNT:
REGEXP_COUNT returns the number of times a pattern occurs in a source string. The function evaluates strings using characters as defined by the input character set. It returns an integer indicating the number of occurrences of pattern. If no match is found, then the function returns 0.
Syntax:
REGEXP_COUNTSOURCE_STRING, PATTERN, 3, ‘i’)
- SOURCE_STRING is a character expression that serves as the search value. Its data type may be CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB, or NCLOB.
- PATTERN is the regular expression. It is usually a text literal and can be of any of the data types CHAR, VARCHAR2, NCHAR, or NVARCHAR2. It can contain up to 512 bytes.
- Note: If the data type of pattern is different from SOURCE_STRING, then Oracle will converts PATTERN to the data type of SOURCE_STRING.
.
- POSITION is a positive integer indicating the character of SOURCE_STRING where Oracle should begin the search. The default is 1 . It will search from the first character of SOURCE_STRING. After finding the first occurrence of PATTERN, the database searches for a second occurrence beginning with the first character following the first occurrence.
- MATCH_PARAM is a text literal to change the default matching behavior of the function. You can specify one or more of the following values for MATCH_PARAM:
- ‘i’ specifies case-insensitive matching.
- ‘c’ specifies case-sensitive matching.
- ‘n’ allows the period (.), which is the match-any-character character, to match the newline character. If you omit this parameter, then the period does not match the newline character.
- ‘m’ treats the source string as multiple lines. Oracle interprets the caret (^) and dollar sign ($) as the start and end, respectively, of any line anywhere in the source string, rather than only at the start or end of the entire source string. If you omit this parameter, then Oracle treats the source string as a single line.
- ‘x’ ignores whitespace characters. By default, whitespace characters match themselves.
You can give one or more match pattern. But if it is contradictory, then ORACLE will take last given match Pattern.
Examples
The following example shows that sub expressions parentheses in pattern are ignored:
SQL>SELECT REGEXP_COUNT(‘abcabcabcabcabc’, ‘(ab)c’, 1, ‘i’) REGEXP_COUNT FROM DUAL;
REGEXP_COUNT
————
5
SQL>SELECT REGEXP_COUNT(‘abcabc’, ‘(ab)c’, 3, ‘i’) REGEXP_COUNT FROM DUAL;
REGEXP_COUNT
————
1
Other functions are
REGEXP_LIKE(source, regexp, modes) is probably the one you’ll use most. You can use it in the WHERE and HAVING clauses of a SELECT statement. In a PL/SQL script, it returns a Boolean value. You can also use it in a CHECK constraint. The source parameter is the string or column the regex should be matched against. The regexp parameter is a string with your regular expression. The modes parameter is optional. It sets the matching modes.
SELECT * FROM mytable WHERE REGEXP_LIKE(mycolumn, 'regexp', 'i');
IF REGEXP_LIKE('subject', 'regexp') THEN /* Match */ ELSE /* No match */ END IF;
ALTER TABLE mytable ADD (CONSTRAINT mycolumn_regexp CHECK (REGEXP_LIKE(mycolumn, '^regexp$')));
REGEXP_SUBSTR(source, regexp, position, occurrence, modes) returns a string with the part of source matched by the regular expression. If the match attempt fails, NULL is returned. You can use REGEXP_SUBSTR with a single string or with a column. You can use it in SELECT clauses to retrieve only a certain part of a column. The position parameter specifies the character position in the source string at which the match attempt should start. The first character has position 1. The occurrence parameter specifies which match to get. Set it to 1 to get the first match. If you specify a higher number, Oracle will continue to attempt to match the regex starting at the end of the previous match, until it found as many matches as you specified. The last match is then returned. If there are fewer matches, NULL is returned. Do not confuse this parameter with backreferences. Oracle does not provide a function to return the part of the string matched by a capturing group. The last three parameters are optional.
SELECT REGEXP_SUBSTR(mycolumn, 'regexp') FROM mytable;
match := REGEXP_SUBSTR('subject', 'regexp', 1, 1, 'i')
REGEXP_REPLACE(source, regexp, replacement, position, occurrence, modes) returns the source string with one or all regex matches replaced. If no matches can be found, the original string is replaced. If you specify a positive number for occurrence (see the above paragraph) only that match is replaced. If you specify zero or omit the parameter, all matches are replaced. The last three parameters are optional. The replacement parameter is a string that each regex match will be replaced with. You can use the backreferences \1 through \9 in the replacement text to re-insert text matched by a capturing group. You can reference the same group more than once. There’s no replacement text token to re-insert the whole regex match. To do that, put parentheses around the whole regexp, and use \1 in the replacement. If you want to insert \1 literally, use the string '\\1'. Backslashes only need to be escaped if they’re followed by a digit or another backslash. To insert \\ literally, use the string '\\\\'. While SQL does not require backslashes to be escaped in strings, the REGEXP_REPLACE function does.
SELECT REGEXP_REPLACE(mycolumn, 'regexp', 'replacement') FROM mytable;
result := REGEXP_REPLACE('subject', 'regexp', 'replacement', 1, 0, 'i');
REGEXP_INSTR(source, regexp, position, occurrence, return_option, modes) returns the beginning or ending position of a regex match in the source string. This function takes the same parameters as REGEXP_SUBSTR, plus one more. Set return_option to zero or omit the parameter to get the position of the first character in match. Set it to one to get the position of the first character after the match. The first character in the string has position 1. REGEXP_INSTR returns zero if the match cannot be found. The last 4 parameters are optional.