= UML - Solving issues in software design
:neo4j-version: 2.3.0
:author: Nicolas Rouyer
:toc: right
:twitter: @rrrouyer
:description: Graph database in software design, UML, activity diagram
:tags: domain:software, use-case:solving-issues

This interactive Neo4j graph tutorial deals with solving issues in software design, starting from a UML activity diagram.

'''

:toc: left

'''

[[introduction]]
== Graph Databases and UML : Solving issues in software design


My original question was "*Can we model UML diagrams in a Neo4j graph database, and what can we learn from that?*".
I have been working for some time as an IT architect, and have modeled quite a few information systems using the well-known design language UML (Unified Modeling Language). One can create many kinds of diagrams with UML, each of them giving *a particular point of view* on the system.

Looking at some of these diagrams, I told myself that they were in fact graphs, that we could certainly establish a correspondence between UML and graph databases, and that graph traversals could teach us more about IT system design. To validate this intuition, I first selected an activity diagram that shows how to resolve an issue in a software design.

image::http://www.uml-diagrams.org/examples/activity-examples-resolve-issue.png[UML activity diagram example]

Looks like a graph, doesn't it? Shouldn't we create this "solving issue" graph? +
Let's write some Cypher code to build the graph...

[[graph_creation]]
=== Creating UML graph data
[source,cypher]
----
// create nodes
CREATE (beginning:BEGINNING {name:"Start"})
CREATE (choice1:CHOICE {name:"Choice 1"})
CREATE (choice2:CHOICE {name:"Choice 2"})
CREATE (choice3:CHOICE {name:"Choice 3"})
CREATE (choice4:CHOICE {name:"Choice 4"})
CREATE (choice5:CHOICE {name:"Choice 5"})
CREATE (choice6:CHOICE {name:"Choice 6"})
CREATE (action1:ACTION {name:"Create ticket"})
CREATE (action2:ACTION {name:"Reproduce issue"})
CREATE (action3:ACTION {name:"Update ticket"})
CREATE (action4:ACTION {name:"Identify issue"})
CREATE (action5:ACTION {name:"Determine resolution"})
CREATE (action6:ACTION {name:"Fix issue"})
CREATE (action7:ACTION {name:"Verify fix"})
CREATE (action8:ACTION {name:"Close ticket"})
CREATE (end:END {name:"End"})
// create relationships
CREATE (beginning)-[:NEXT]->(action1)
CREATE (action1)-[:NEXT]->(choice1)
CREATE (choice1)-[:NEXT]->(action2)
CREATE (action2)-[:NEXT]->(choice2)
CREATE (choice2)-[:NEXT {description:"Cannot reproduce"}]->(choice3)
CREATE (choice2)-[:NEXT {description:"issue reproduced"}]->(choice4)
CREATE (choice3)-[:NEXT]->(action3)
CREATE (action3)-[:NEXT]->(choice1)
CREATE (choice4)-[:NEXT]->(action4)
CREATE (action4)-[:NEXT]->(choice5)
CREATE (choice5)-[:NEXT {description:"known issue"}]->(choice3)
CREATE (choice5)-[:NEXT {description:"new issue"}]->(action5)
CREATE (action5)-[:NEXT]->(action6)
CREATE (action6)-[:NEXT]->(action7)
CREATE (action7)-[:NEXT]->(choice6)
CREATE (choice6)-[:NEXT {description:"issue not resolved"}]->(choice4)
CREATE (choice6)-[:NEXT {description:"issue resolved"}]->(action8)
CREATE (action8)-[:NEXT]->(end)
----
Graph data loaded !
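
As a quick sanity check on the load, the nodes can be counted per label. This is only a sketch: it assumes the database contained nothing else before the creation script ran, and the aliases `Label` and `Total` are illustrative names.

[source,cypher]
----
// count the nodes created for each label
// on an otherwise empty database: ACTION 8, BEGINNING 1, CHOICE 6, END 1
MATCH (n)
RETURN head(labels(n)) AS Label, count(*) AS Total
ORDER BY Label
----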


[[graph_consultation]]
=== First graph query : path to bug resolution

Once our issue-solving graph is created, let us start with a simple query. The first question we ask ourselves is: +
*What is the nominal path from bug creation to resolution?* +
The following Cypher query gives the answer:

[source,cypher]
----
// finding paths from bug creation to bug resolution
MATCH paths=(beginning:BEGINNING {name:"Start"})-[*]->(end:END {name:"End"})
WITH nodes(paths) AS steps, paths
RETURN EXTRACT(step IN steps | step.name) AS Paths,
LENGTH(paths) AS Path_Length
----
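
Why does a variable-length pattern on a graph with loops still return the nominal path? Cypher never traverses the same relationship twice within one matched path, and every loop in this diagram comes back into the main flow through a transition that the path has already used (for example, from "Choice 1" to "Reproduce issue"). A small check, with illustrative aliases, should therefore report a single path:

[source,cypher]
----
// how many distinct paths lead from Start to End ?
// with the data created above: Path_Count = 1, Shortest_Length = 13
MATCH paths=(beginning:BEGINNING {name:"Start"})-[*]->(end:END {name:"End"})
RETURN count(paths) AS Path_Count, min(length(paths)) AS Shortest_Length
----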

'''

=== Refining graph query : hiding choice nodes

Choice nodes carry little meaning on their own. To hide them in the results, the Cypher query is modified as follows:

[source,cypher]
----
// finding all paths from bug creation to bug resolution, hiding choice nodes
MATCH paths=(beginning:BEGINNING {name:"Start"})-[*]->(end:END {name:"End"})
RETURN [step in nodes(paths) WHERE NOT step:CHOICE | step.name] AS Paths
----
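
With the choice nodes hidden, the decisions themselves remain available in the `description` property of the outgoing transitions. As a possible complement (a sketch, with `Decisions` as an illustrative alias), the same list-comprehension syntax can collect them along the path:

[source,cypher]
----
// listing the decisions taken along the path from bug creation to bug resolution
// with the data created above: ["issue reproduced", "new issue", "issue resolved"]
MATCH paths=(beginning:BEGINNING {name:"Start"})-[*]->(end:END {name:"End"})
RETURN [rel IN relationships(paths) WHERE rel.description IS NOT NULL | rel.description] AS Decisions
----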

'''

[[graph_refining]]
=== Refining the graph to observe Time To Resolution (TTR)

Let us now try to understand how some events in the resolution process affect the *TTR (Time To Resolution)* indicator. +
For example, *how much time is lost when a bug cannot be reproduced on the first attempt?* +
To keep things simple, let us assume that all transitions in the bug resolution graph are equivalent, and that each transition takes 1 hour to complete.

[source,cypher]
----
// adding work time for each transition in bug resolution process
MATCH ()-[r:NEXT]->() SET r.minutes=60
----
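
To make sure every transition received its duration, one option is to count the transitions and sum their `minutes` property. This is a sketch that assumes only the graph created above is present; the aliases are illustrative.

[source,cypher]
----
// checking that each transition carries a duration
// with the data created above: Transitions = 18, Total_Minutes = 1080
MATCH ()-[r:NEXT]->()
RETURN count(r) AS Transitions, sum(r.minutes) AS Total_Minutes
----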

To discover the impact of a non-reproducible bug on the entire bug resolution chain, let us first compute the time needed to go through this additional loop:

image::https://github.com/nrouyer/images/blob/master/uml_zoom.png?raw=true[UML Zoom on bug reproduction]

I can't reproduce the problem (1 hour) +
+ I update the ticket (1 hour) +
+ I choose to try and reproduce the problem (1 hour) +
+ I try to reproduce the problem (1 hour) +
+ I observe if I reproduced the problem (1 hour) +
This tedious approach (!) adds up to 5 additional hours, that is, 300 minutes.

Let us confirm this with Cypher:
[source,cypher]
----
// What is the time spent in the bug reproduction loop ?
MATCH p=(choice2:CHOICE {name:"Choice 2"})-[r:NEXT*5]->(choice2)
RETURN reduce(totalTime = 0, rel IN relationships(p)| totalTime + rel.minutes) AS total
----

Now let us compute the duration of the nominal resolution process:
[source,cypher]
----
// What is the time spent in the bug resolution nominal path ?
MATCH nominal_path=(beginning:BEGINNING {name:"Start"})-[*]->(end:END {name:"End"})
RETURN reduce(totalTime = 0, rel IN relationships(nominal_path)| totalTime + rel.minutes) AS total
----

Finally, we can compute the ratio, expressed as a percentage, between the issue reproduction loop and the nominal resolution path:
[source,cypher]
----
// ratio between reproduction loop and nominal path
MATCH p=(choice2:CHOICE {name:"Choice 2"})-[r:NEXT*5]->(choice2)
WITH reduce(totalTime = 0, rel IN relationships(p)| totalTime + rel.minutes) AS total_reproduction
MATCH nominal_path=(beginning:BEGINNING {name:"Start"})-[*]->(end:END {name:"End"})
WITH reduce(totalTime2 = 0, relas IN relationships(nominal_path)| totalTime2 + relas.minutes) AS total_nominal, total_reproduction
RETURN total_reproduction/(total_nominal*1.0)*100 AS ratio
----
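
The figure can be cross-checked by hand: the reproduction loop has 5 transitions and the nominal path has 13, each lasting 60 minutes. The standalone query below (a sketch with illustrative aliases, touching no graph data) repeats that arithmetic; the ratio comes out at roughly 38.5, so a single failed reproduction attempt adds more than a third of the nominal resolution time.

[source,cypher]
----
// worked arithmetic: 5 transitions in the loop, 13 in the nominal path, 60 minutes each
RETURN 5 * 60 AS Loop_Minutes,
       13 * 60 AS Nominal_Minutes,
       (5 * 60) / (13 * 60 * 1.0) * 100 AS Ratio
----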

'''

[[conclusion]]
=== An Appetizer ?
This first graphgist may be only an appetizer, both for bug resolution with graphs and for *UML and graphs*. +
I hope to find the time in the coming months to share other insights on these topics, with the help of the graphistas' community :-) +
Please enjoy and send your remarks to: +
mailto:rouyer.nicolas@gmail.com[Nicolas ROUYER]