Silpa gave a table like this for testing:
create table network_table ( origin number , destination number ) /
And some data as well:
insert into network_table values (11, 12) / insert into network_table values (12, 13) / insert into network_table values (14, 11) / insert into network_table values (14, 15) / insert into network_table values (14, 16) / insert into network_table values (14, 17) / insert into network_table values (18, 11) / insert into network_table values (19, 110) / insert into network_table values (111, 112) / insert into network_table values (23, 15) / insert into network_table values (23, 24) / commit /
What I found very useful was drawing these data as nodes in a graph:
Silpas requirement was to start at a given node and select the entire network reachable from that node - no matter which direction the arrow points. If starting at for example node 11, the select should return all the blue (origin, destination) connections.
Silpa had tried using CONNECT BY hierachical queries, but it didn't quite work out. I came up with this solution using recursive subquery factoring instead:
with net (origin, destination, origin_recur, destination_recur, lvl)
as (
select n.origin
, n.destination
, 11 origin_recur
, case n.origin
when 11 then destination
else origin
end destination_recur
, 1 lvl
from network_table n
where n.origin = 11
or n.destination = 11
union all
select n.origin
, n.destination
, net.destination_recur origin_recur
, case n.origin
when net.destination_recur then n.destination
else n.origin
end destination_recur
, net.lvl + 1 lvl
from net
join network_table n
on (n.origin = net.destination_recur and n.destination != net.origin_recur)
or (n.destination = net.destination_recur and n.origin != net.origin_recur)
)
search breadth first by origin_recur, destination_recur set ordering
select origin
, destination
, origin_recur
, destination_recur
, lvl
from net
order by ordering
/
Giving this output of the 9 blue arrows from the drawing:
ORIGIN DESTINATION ORIGIN_RECUR DESTINATION_RECUR LVL
---------- ----------- ------------ ----------------- -----
11 12 11 12 1
14 11 11 14 1
18 11 11 18 1
12 13 12 13 2
14 15 14 15 2
14 16 14 16 2
14 17 14 17 2
23 15 15 23 3
23 24 23 24 4
The "trick" is that we create origin_recur and destination_recur. In the "starting" part of the subquery (before UNION ALL) origin_recur is always 11 (my starting point) and destination_recur is "the other node". In the recursive part of the subquery (after UNION ALL) we find the next level by joining on either origin or destination equal to "prior" destination_recur (plus we make sure we don't "double back" on ourselves.) And for that level we again make sure origin_recur is "where we came from" and destination_recur is "the other node", no matter which way the arrow originally points. In other words: origin_recur and destination_recur represents the blue arrows but they just all have the direction modified to always be the direction we traverse the graph.
This technique would have been impossible to do with a CONNECT BY, as the recursion is done on calculated columns. In a CONNECT BY clause PRIOR can only be used on table columns, not calculated column aliases. But that is possible in version 11.2 using the recursive subquery factoring.
The above output shows all level 1's first, then level 2, and so on. That is caused by the BREADTH FIRST clause. It can be changed to DEPTH FIRST:
with net (origin, destination, origin_recur, destination_recur, lvl)
as (
select n.origin
, n.destination
, 11 origin_recur
, case n.origin
when 11 then destination
else origin
end destination_recur
, 1 lvl
from network_table n
where n.origin = 11
or n.destination = 11
union all
select n.origin
, n.destination
, net.destination_recur origin_recur
, case n.origin
when net.destination_recur then n.destination
else n.origin
end destination_recur
, net.lvl + 1 lvl
from net
join network_table n
on (n.origin = net.destination_recur and n.destination != net.origin_recur)
or (n.destination = net.destination_recur and n.origin != net.origin_recur)
)
search depth first by origin_recur, destination_recur set ordering
select origin
, destination
, origin_recur
, destination_recur
, lvl
from net
order by ordering
/
Which gives us output that traverses all the way down each branch first before going to the next branch (looking perhaps a little more like you would be used to from CONNECT BY):
ORIGIN DESTINATION ORIGIN_RECUR DESTINATION_RECUR LVL
---------- ----------- ------------ ----------------- -----
11 12 11 12 1
12 13 12 13 2
14 11 11 14 1
14 15 14 15 2
23 15 15 23 3
23 24 23 24 4
14 16 14 16 2
14 17 14 17 2
18 11 11 18 1
So far so good, but what happens when you add another connection to the graph:
insert into network_table values (24, 11) /
Shown on the graph (the green arrow) we see we have a cycle:
If we run the sql again, we get this error:
ERROR at line 1: ORA-32044: cycle detected while executing recursive WITH query
Recursive subquery factoring has a CYCLE clause that can help us to avoid this error:
with net (origin, destination, origin_recur, destination_recur, lvl)
as (
select n.origin
, n.destination
, 11 origin_recur
, case n.origin
when 11 then destination
else origin
end destination_recur
, 1 lvl
from network_table n
where n.origin = 11
or n.destination = 11
union all
select n.origin
, n.destination
, net.destination_recur origin_recur
, case n.origin
when net.destination_recur then n.destination
else n.origin
end destination_recur
, net.lvl + 1 lvl
from net
join network_table n
on (n.origin = net.destination_recur and n.destination != net.origin_recur)
or (n.destination = net.destination_recur and n.origin != net.origin_recur)
)
search depth first by origin_recur, destination_recur set ordering
cycle destination_recur set is_cycle to 'Y' default 'N'
select origin
, destination
, origin_recur
, destination_recur
, lvl
, is_cycle
from net
order by ordering
/
Giving us this output:
ORIGIN DESTINATION ORIGIN_RECUR DESTINATION_RECUR LVL I
---------- ----------- ------------ ----------------- ----- -
11 12 11 12 1 N
12 13 12 13 2 N
14 11 11 14 1 N
14 15 14 15 2 N
23 15 15 23 3 N
23 24 23 24 4 N
24 11 24 11 5 N
11 12 11 12 6 N
12 13 12 13 7 N
14 11 11 14 6 Y
18 11 11 18 6 N
14 16 14 16 2 N
14 17 14 17 2 N
18 11 11 18 1 N
24 11 11 24 1 N
23 24 24 23 2 N
23 15 23 15 3 N
14 15 15 14 4 N
14 11 14 11 5 N
11 12 11 12 6 N
12 13 12 13 7 N
18 11 11 18 6 N
24 11 11 24 6 Y
14 16 14 16 5 N
14 17 14 17 5 N
The CYCLE clause creates a column IS_CYCLE that is Y when a cycle has been detected. As destination_recur is specified, it will be Y whenever a destination_recur is found that already exists as destination_recur in the same branch of the recursion. Therefore we get a lot of repeats in the output, because the new connection from 24 to 11 is not detected as a cycle, but when we continue again from 11 then we repeat a subbranch we have already visited and get a cycle.
We can improve on it by starting our recursion in a different way. We can create a fictional "level 0" which makes certain that when we go around and hit 11 again, it will be recognized as a cycle:
with net (origin, destination, origin_recur, destination_recur, lvl)
as (
select 0 origin
, 11 destination
, 0 origin_recur
, 11 destination_recur
, 0 lvl
from dual
union all
select n.origin
, n.destination
, net.destination_recur origin_recur
, case n.origin
when net.destination_recur then n.destination
else n.origin
end destination_recur
, net.lvl + 1 lvl
from net
join network_table n
on (n.origin = net.destination_recur and n.destination != net.origin_recur)
or (n.destination = net.destination_recur and n.origin != net.origin_recur)
)
search depth first by origin_recur, destination_recur set ordering
cycle destination_recur set is_cycle to 'Y' default 'N'
select origin
, destination
, origin_recur
, destination_recur
, lvl
, is_cycle
from net
order by ordering
/
Giving us a somewhat better output:
ORIGIN DESTINATION ORIGIN_RECUR DESTINATION_RECUR LVL I
---------- ----------- ------------ ----------------- ----- -
0 11 0 11 0 N
11 12 11 12 1 N
12 13 12 13 2 N
14 11 11 14 1 N
14 15 14 15 2 N
23 15 15 23 3 N
23 24 23 24 4 N
24 11 24 11 5 Y
14 16 14 16 2 N
14 17 14 17 2 N
18 11 11 18 1 N
24 11 11 24 1 N
23 24 24 23 2 N
23 15 23 15 3 N
14 15 15 14 4 N
14 11 14 11 5 Y
14 16 14 16 5 N
14 17 14 17 5 N
This time when it gets to 24->11, it declares a cycle and stops. But there remains another problem: we traverse the cycle in the graph twice, once in each direction. The CYCLE clause cannot help us here.
A simple solution can be to just filter away our fictional level 0 as well as the cycles, and then a plain DISTINCT will remove our duplicates:
with net (origin, destination, origin_recur, destination_recur, lvl)
as (
select 0 origin
, 11 destination
, 0 origin_recur
, 11 destination_recur
, 0 lvl
from dual
union all
select n.origin
, n.destination
, net.destination_recur origin_recur
, case n.origin
when net.destination_recur then n.destination
else n.origin
end destination_recur
, net.lvl + 1 lvl
from net
join network_table n
on (n.origin = net.destination_recur and n.destination != net.origin_recur)
or (n.destination = net.destination_recur and n.origin != net.origin_recur)
)
search depth first by origin_recur, destination_recur set ordering
cycle destination_recur set is_cycle to 'Y' default 'N'
select distinct
origin
, destination
from net
where is_cycle != 'Y'
and lvl != 0
/
This gives us the 10 connections of the network:
ORIGIN DESTINATION
---------- -----------
11 12
12 13
14 11
14 15
23 15
23 24
14 17
18 11
24 11
14 16
That approach may be OK for many purposes, but it is not guaranteed to be in the correct order. This time it looks like we were lucky, but we cannot be certain that for example a HASH distinct will keep the order for us.
So we can instead use the analytic function ROW_NUMBER to help us:
with net (origin, destination, origin_recur, destination_recur, lvl)
as (
select 0 origin
, 11 destination
, 0 origin_recur
, 11 destination_recur
, 0 lvl
from dual
union all
select n.origin
, n.destination
, net.destination_recur origin_recur
, case n.origin
when net.destination_recur then n.destination
else n.origin
end destination_recur
, net.lvl + 1 lvl
from net
join network_table n
on (n.origin = net.destination_recur and n.destination != net.origin_recur)
or (n.destination = net.destination_recur and n.origin != net.origin_recur)
)
search depth first by origin_recur, destination_recur set ordering
cycle destination_recur set is_cycle to 'Y' default 'N'
select origin
, destination
, origin_recur
, destination_recur
, lvl
, is_cycle
, row_number() over (
partition by origin, destination
order by ordering
) rn
, ordering
from net
order by ordering
/
The result will be that the first occurrence in order of ORDERING of a given (origin, destination) pair will be numbered 1, subsequent occurrences numbered 2,...:
ORIGIN DESTINATION ORIGIN_RECUR DESTINATION_RECUR LVL I RN ORDERING
---------- ----------- ------------ ----------------- ----- - --- ----------
0 11 0 11 0 N 1 1
11 12 11 12 1 N 1 2
12 13 12 13 2 N 1 3
14 11 11 14 1 N 1 4
14 15 14 15 2 N 1 5
23 15 15 23 3 N 1 6
23 24 23 24 4 N 1 7
24 11 24 11 5 Y 1 8
14 16 14 16 2 N 1 9
14 17 14 17 2 N 1 10
18 11 11 18 1 N 1 11
24 11 11 24 1 N 2 12
23 24 24 23 2 N 2 13
23 15 23 15 3 N 2 14
14 15 15 14 4 N 2 15
14 11 14 11 5 Y 2 16
14 16 14 16 5 N 2 17
14 17 14 17 5 N 2 18
With this approach the "reverse" traversal of the cycle in the graph gets RN=2, so we can filter those away:
with net (origin, destination, origin_recur, destination_recur, lvl)
as (
select 0 origin
, 11 destination
, 0 origin_recur
, 11 destination_recur
, 0 lvl
from dual
union all
select n.origin
, n.destination
, net.destination_recur origin_recur
, case n.origin
when net.destination_recur then n.destination
else n.origin
end destination_recur
, net.lvl + 1 lvl
from net
join network_table n
on (n.origin = net.destination_recur and n.destination != net.origin_recur)
or (n.destination = net.destination_recur and n.origin != net.origin_recur)
)
search depth first by origin_recur, destination_recur set ordering
cycle destination_recur set is_cycle to 'Y' default 'N'
select *
from (
select origin
, destination
, origin_recur
, destination_recur
, lvl
, is_cycle
, row_number() over (
partition by origin, destination
order by ordering
) rn
, ordering
from net
)
where rn = 1
and lvl != 0
order by ordering
/
Giving us an output of the desired 10 rows:
ORIGIN DESTINATION ORIGIN_RECUR DESTINATION_RECUR LVL I RN ORDERING
---------- ----------- ------------ ----------------- ----- - --- ----------
11 12 11 12 1 N 1 2
12 13 12 13 2 N 1 3
14 11 11 14 1 N 1 4
14 15 14 15 2 N 1 5
23 15 15 23 3 N 1 6
23 24 23 24 4 N 1 7
24 11 24 11 5 Y 1 8
14 16 14 16 2 N 1 9
14 17 14 17 2 N 1 10
18 11 11 18 1 N 1 11
Using this method we actually could use the original recursion rather than the one with a fictional level 0:
with net (origin, destination, origin_recur, destination_recur, lvl)
as (
select n.origin
, n.destination
, 11 origin_recur
, case n.origin
when 11 then destination
else origin
end destination_recur
, 1 lvl
from network_table n
where n.origin = 11
or n.destination = 11
union all
select n.origin
, n.destination
, net.destination_recur origin_recur
, case n.origin
when net.destination_recur then n.destination
else n.origin
end destination_recur
, net.lvl + 1 lvl
from net
join network_table n
on (n.origin = net.destination_recur and n.destination != net.origin_recur)
or (n.destination = net.destination_recur and n.origin != net.origin_recur)
)
search depth first by origin_recur, destination_recur set ordering
cycle destination_recur set is_cycle to 'Y' default 'N'
select *
from (
select origin
, destination
, origin_recur
, destination_recur
, lvl
, is_cycle
, row_number() over (
partition by origin, destination
order by ordering
) rn
, ordering
from net
)
where rn = 1
order by ordering
/
It gives us the same rows as before:
ORIGIN DESTINATION ORIGIN_RECUR DESTINATION_RECUR LVL I RN ORDERING
---------- ----------- ------------ ----------------- ----- - --- ----------
11 12 11 12 1 N 1 1
12 13 12 13 2 N 1 2
14 11 11 14 1 N 1 3
14 15 14 15 2 N 1 4
23 15 15 23 3 N 1 5
23 24 23 24 4 N 1 6
24 11 24 11 5 N 1 7
18 11 11 18 6 N 1 11
14 16 14 16 2 N 1 12
14 17 14 17 2 N 1 13
But we do notice a slight problem: The 18->11 connection is shown here to level 6 where it is more correct to call it level 1.
So we stick to the solution with "level 0" giving us this final query stripped of superfluous information:
with net (origin, destination, origin_recur, destination_recur, lvl)
as (
select 0 origin
, 11 destination
, 0 origin_recur
, 11 destination_recur
, 0 lvl
from dual
union all
select n.origin
, n.destination
, net.destination_recur origin_recur
, case n.origin
when net.destination_recur then n.destination
else n.origin
end destination_recur
, net.lvl + 1 lvl
from net
join network_table n
on (n.origin = net.destination_recur and n.destination != net.origin_recur)
or (n.destination = net.destination_recur and n.origin != net.origin_recur)
)
search depth first by origin_recur, destination_recur set ordering
cycle destination_recur set is_cycle to 'Y' default 'N'
select origin
, destination
, lvl
from (
select origin
, destination
, lvl
, row_number() over (
partition by origin, destination
order by ordering
) rn
, ordering
from net
)
where rn = 1
and lvl != 0
order by ordering
/
Giving this final output:
ORIGIN DESTINATION LVL
---------- ----------- -----
11 12 1
12 13 2
14 11 1
14 15 2
23 15 3
23 24 4
24 11 5
14 16 2
14 17 2
18 11 1
All in all a fun challenge and definitely a case for me to understand the differences of using a recursive subquery rather than a connect by hierarcical query :-) .


Kim,
ReplyDeletevery nice !
We discussed a similar, but slightly simpler problem on
http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:489772591421#5496106000346987396
Matthias