Grouping by Regex in Oracle PL/SQL
29 Mar 2020
With Oracle PL/SQL you can use built in regular expression functions to match query results using SELECT statements. More on that documentation can be found here in the oracle reference documentation.
This entry will use the REGEX_SUBSTR() function to find strings that match the given pattern. To group by substring results, a nested select statement is used to refer to the named column. As a reference we will use Oracle’s built in HR schema Employee table.
As an example we can determine the maximum salary for each department using only the JOB_ID field:
1. Explore the data
FIRST_NAME | LAST_NAME | JOB_ID | SALARY |
---|---|---|---|
Steven | King | AD_PRES | 24000 |
Neena | Kochhar | AD_VP | 17000 |
Lex | De Haan | AD_VP | 17000 |
Alexander | Hunold | IT_PROG | 9000 |
Bruce | Ernst | IT_PROG | 6000 |
David | Austin | IT_PROG | 4800 |
Valli | Pataballa | IT_PROG | 4800 |
... | ... | ... | ... |
2. Use a nested select statement to extract the department from JOB_ID and calculate the maximum salary by department using a GROUP BY clause. The pattern ‘^\w{2}’ means to select the first two alphabet characters at the start of the string.
DEPARTMENT | MAX(SALARY) |
---|---|
FI | 12008 |
ST | 8200 |
HR | 6500 |
AD | 24000 |
SA | 14000 |
MK | 13000 |
PR | 10000 |
AC | 12008 |
IT | 9000 |
PU | 11000 |
SH | 4200 |
A useful tool for checking regular expressions is regex101.