y-okumura-isp/ROS2_rt_measurement_comparison_table.md

## ROS2_rt_measurement_comparison_table.md

      
We are surveying existing ROS2 measurement projects, with a focus on real-time behavior.

[1] CI: buildfarm_perf_tests + Apex.AI performance_test (forked from https://gitlab.com/ApexAI/performance_test)

Results are available at http://build.ros2.org/job/Fci__nightly-performance_ubuntu_focal_amd64/
This CI job has 3 tests. We refer only to Test1 and Test2 because they measure communication (they have separate columns in the table below).


[2] iRobot ros2-performance

result for crystal: https://github.com/irobot-ros/ros2-performance/tree/master/performances/experiments/crystal
This can test more complex situations than [1]. We can specify a network topology, the number of nodes and topics, and so on.
The sample topology "sierra-nevada" has 10 nodes, 13 topics, and more publishers and subscribers. We refer to this as "SN" in the [2] column.


[3] pendulum_control

slides: https://roscon.ros.org/2015/presentations/RealtimeROS2.pdf


We have created 3 tables: one table for function comparison and two tables for metrics, as listed below.

- function comparison table
- metrics comparison table (behavior)
- metrics comparison table (resource)

We plan to post an article on ROS Discourse about which policy is preferred for each item.

### function comparison table

Roughly speaking, a ROS2 program has the following layers.
+----------------------------+
|  Publisher / Subscription  |
|  rclcpp(Executor/Nodes)    |
|  DDS                       |   ROS2 layer
+----------------------------+

+----------------------------+
|  Process and RT-setting    |
+----------------------------+

+----------------------------+
|  HW / OS                   |
+----------------------------+

We divide the table into categories and subcategories according to these layers.
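
To make the "Process and RT-setting" layer concrete, here is a minimal sketch of the three settings the table compares: scheduling policy, CPU affinity, and a page fault guard. It is not code from [1], [2], or [3]. The helper names are hypothetical, only the Linux calls inside them are real, and `rttest_lock_and_prefault_dynamic()` may do more than the plain `mlockall()` shown here.

```cpp
#include <sched.h>
#include <sys/mman.h>

#include <cassert>
#include <cerrno>

// Hypothetical helper names; only the Linux/POSIX calls are real APIs.

// Returns the lowest-numbered CPU the calling thread may run on, or -1 on error.
int first_allowed_cpu() {
  cpu_set_t set;
  CPU_ZERO(&set);
  if (sched_getaffinity(0, sizeof(set), &set) != 0) {
    return -1;
  }
  for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
    if (CPU_ISSET(cpu, &set)) {
      return cpu;
    }
  }
  return -1;
}

// CPU affinity: restricts the calling thread to one CPU.
// Returns 0 on success, otherwise the errno value.
int pin_to_cpu(int cpu) {
  assert(cpu >= 0 && cpu < CPU_SETSIZE);
  cpu_set_t set;
  CPU_ZERO(&set);
  CPU_SET(cpu, &set);
  return sched_setaffinity(0, sizeof(set), &set) == 0 ? 0 : errno;
}

// Scheduling policy: requests SCHED_RR for the calling thread.
// Returns 0 on success, otherwise the errno value (commonly EPERM when the
// process lacks real-time privileges).
int use_round_robin(int priority) {
  assert(priority >= sched_get_priority_min(SCHED_RR));
  assert(priority <= sched_get_priority_max(SCHED_RR));
  sched_param param{};
  param.sched_priority = priority;
  return sched_setscheduler(0, SCHED_RR, &param) == 0 ? 0 : errno;
}

// Page fault guard (simplified): locks current and future pages into RAM.
// Returns 0 on success, otherwise the errno value.
int lock_all_memory() {
  return mlockall(MCL_CURRENT | MCL_FUTURE) == 0 ? 0 : errno;
}
```

A program would typically call these once at startup, before entering its measurement loop, and check each return value. Without sufficient privileges, `use_round_robin()` and `lock_all_memory()` are expected to fail, which matters when comparing results between machines.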


| No | Category | Subcategory | name | [1] Test1 | [1] Test2 | [2] | [3] |
|---|---|---|---|---|---|---|---|
| 1 | HW/OS | kernel | PREEMPT_RT patch | - | - | - | O |
| 2 | | kernel thread | adjust CPU Core | - | - | - | - |
| 3 | Process | common | duration | 30 [s] (`--max_runtime 30` specified) | <- | 5 [s] (default) | 1000 or 7000 [s] (1M-7M cycles, 1 kHz) |
| 4 | | | # of processes | 1 | 2 | 1 process / 1 json file | 1 for Realtime, 2 for non-Realtime |
| 5 | | | # of threads | 2 threads (main for statistics, child for pub/sub) | <- | option. We can separate threads per executor. | - |
| 6 | RT-setting | scheduling | scheduling policy | - (`use_rt_prio` does not look specified) | <- | CFS | SCHED_RR (but DDS threads are CFS) |
| 7 | | CPU affinity | CPU affinity | - (`use_rt_cpus` does not look specified) | <- | CFS | |
| 8 | | memory | page fault guard | - | <- | - | set by `rttest_lock_and_prefault_dynamic()` |
| 9 | DDS/RMW | | supported RMW | connext, cyclone, fastrtps_{cpp,dynamic} | <- | use `RMW_IMPLEMENTATION` | OpenSplice & Connext are in README |
| 10 | | | supported DDS (direct call) | Cyclone, FastRTPS | - | NA | NA |
| 11 | | | heterogeneous communication | - | O (rmw_*) | undescribed (looks impossible because of 1 process) | undescribed (looks impossible) |
| 12 | rclcpp | init() option | (nothing) | | | | |
| 13 | | Executor | Executor class | SingleThreadedExecutor | <- | StaticSingleThreadedExecutor | RttExecutor (loop by `clock_nanosleep`) |
| 14 | | Node | # of Nodes | 1 | 1 per process | 10 in SN | 2 |
| 15 | | | use_intra_process_comms | ON | <- | option | OFF |
| 16 | Communication detail | Communication style | 1-way/2-way | 1-way | 2-way | more complex | controller and simulator |
| 17 | | QoS | policy, depth | KeepAll. KeepLast(10) if topic >= 4 MB | <- | KeepLast(10) | KeepLast(1) |
| 18 | | | Reliability | Best Effort | <- | Reliable | Best Effort |
| 19 | | | Durability | volatile | <- | volatile | volatile (transient for setpoint) |
| 20 | | # of topics | # of topics | | | 13 (SN) | |
| 21 | | data size pattern | data size pattern | 1,4,16,32,64,512K, 1,2,4,8,8M | <- | 8-250B, 1-600KB, 1,4,8 MB. almost -100 B(SN) | 64bit val * 10 = 80 Byte (roughly) |
| 22 | | Hz | Hz | 1000 | <- | 2, 10, 100 (SN) | 1000 |
| 23 | Publisher / Timer | publisher | # of publishers | 1 | pub process only | many per node (over 10 in SN) | 3 |
| 24 | | data | ptr_type | shared_ptr | <- | option (unique by default, shared in SN) | ConstSharedPtr |
| 25 | | | data allocation | allocated first | <- | allocate in each loop | sensor, command: instance val. logger: local val |
| 26 | | | internal api (borrow etc) | (unknown) | <- | (unknown) | (unknown) |
| 27 | | Timer | periodic wakeup mechanism | by `std::this_thread::sleep_for` | <- | `rclcpp::Node::create_wall_timer` | `clock_nanosleep` (see Executor above) |
| 28 | Subscription / Callback | subscriber | # of subscribers | 1 | sub process only | many per node (over 10 in SN) | 3 |
| 29 | | | spin | spin_once, calculates statistics in each loop | <- | Executor::spin | see Executor above |
| 30 | | data | ptr_type | shared_ptr | <- | option. not specified so shared_ptr (NV) | ConstSharedPtr |
| 31 | | | data allocation (recv buf) | (unknown) | <- | (unknown) | MessagePoolMemoryStrategy |
| 32 | | | internal api (borrow etc) | (unknown) | <- | (unknown) | (unknown) |
| 33 | Other functions | Client/Server | has test? | - | <- | can select | - |
| 34 | | Parameter | has test? | - | <- | - | - |
| 35 | | Action | has test? | - | <- | - | - |
| 36 | | Lifecycle | has test? | - | <- | x | - |
| 37 | Measurement | measurement under discovery | measurement under discovery | ignore first 3 seconds | <- | discovery wait | - |
| 38 | | system stress | stress tool | ? | <- | ? | by `stress` command |
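
The Executor and Timer rows note that [3] drives its loop with `clock_nanosleep`. As an illustration of that wakeup mechanism, the sketch below sleeps until absolute deadlines on `CLOCK_MONOTONIC` and records how late each wakeup was. This is our own simplified example, not the RttExecutor or rttest implementation, and the function names are hypothetical.

```cpp
#include <time.h>

#include <cassert>
#include <cerrno>
#include <cstdint>
#include <vector>

constexpr std::int64_t kNsPerSec = 1000000000LL;

// Converts a timespec to nanoseconds.
std::int64_t to_ns(const timespec & t) {
  return static_cast<std::int64_t>(t.tv_sec) * kNsPerSec + t.tv_nsec;
}

// Converts nanoseconds to a timespec.
timespec from_ns(std::int64_t ns) {
  timespec t;
  t.tv_sec = ns / kNsPerSec;
  t.tv_nsec = ns % kNsPerSec;
  return t;
}

// Wakes up `cycles` times, `period_ns` apart, and returns the lateness of
// each wakeup in nanoseconds (actual wakeup time minus the deadline).
std::vector<std::int64_t> run_periodic(int cycles, std::int64_t period_ns) {
  assert(cycles >= 0 && period_ns > 0);
  std::vector<std::int64_t> lateness;
  lateness.reserve(cycles);  // allocate before the loop, not inside it

  timespec now{};
  clock_gettime(CLOCK_MONOTONIC, &now);
  std::int64_t deadline = to_ns(now);

  for (int i = 0; i < cycles; ++i) {
    // Advance the deadline from the previous deadline, not from "now",
    // so that wakeup lateness does not accumulate as drift.
    deadline += period_ns;
    const timespec wake = from_ns(deadline);
    // clock_nanosleep returns the error number directly; retry on EINTR.
    while (clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &wake, nullptr) == EINTR) {
    }
    clock_gettime(CLOCK_MONOTONIC, &now);
    lateness.push_back(to_ns(now) - deadline);
  }
  return lateness;
}
```

For a 1 kHz loop as in the Hz row, `run_periodic(1000, 1000000)` would run for about one second. The spread of the returned values is one way to express timer jitter.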


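### metrics comparison table (behavior)

| No | Category | name | [1] | [2] | [3] |
|---|---|---|---|---|---|
| 1 | communication quality | total sent | Test1, Test2 | O | - |
| 2 | | total recv | Test1, Test2 | O | O |
| 3 | | losses | Test1, Test2 | O | - |
| 4 | | late/too_late | - | O | - |
| 5 | | trip time stats | Test1, Test2 | - | - |
| 6 | program latency | PDP/EDP discovery | - | O | - |
| 7 | | timer jitter (nanosleep) | - | - | O |
| 8 | | callback jitter | - | - | - |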
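
To show how the communication quality items relate to each other, the sketch below derives the loss count and basic trip time statistics from a sent counter and the per-message trip times observed on the receiving side. The struct and function names are hypothetical. The definition of a loss as "sent minus received" is our assumption, and each project may count losses differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical summary of one measurement run.
struct TripSummary {
  std::uint64_t sent;
  std::uint64_t received;
  std::uint64_t lost;
  double min_ms;
  double max_ms;
  double mean_ms;
};

// `sent` is the number of published messages; `trip_ms` holds one trip time
// (in milliseconds) per message that actually arrived.
TripSummary summarize_trips(std::uint64_t sent, const std::vector<double> & trip_ms) {
  assert(trip_ms.size() <= sent);
  TripSummary s{};
  s.sent = sent;
  s.received = trip_ms.size();
  s.lost = s.sent - s.received;  // assumed definition of "losses"
  if (!trip_ms.empty()) {
    const auto range = std::minmax_element(trip_ms.begin(), trip_ms.end());
    s.min_ms = *range.first;
    s.max_ms = *range.second;
    s.mean_ms = std::accumulate(trip_ms.begin(), trip_ms.end(), 0.0) /
                static_cast<double>(trip_ms.size());
  }
  return s;
}
```

For example, `summarize_trips(5, {1.0, 3.0, 2.0})` reports 3 received, 2 lost, and a mean trip time of 2.0 ms.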


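### metrics comparison table (resource)

| No | Category | name | CI(g1+g3) | g2 | pendulum |
|---|---|---|---|---|---|
| 1 | CPU | CPU Usage | Test1, Test2, Test3 | O | - |
| 2 | Memory | maxrss | Test1, Test2 | - | - |
| 3 | | Phy | Test2, Test3 | O | - |
| 4 | | RES | Test2, Test3 | O (rss in resource) | - |
| 5 | | Virt | Test2, Test3 | O (vsz in resource) | - |
| 6 | | arena | - | O | - |
| 7 | | in use | - | O | - |
| 8 | | mmap | - | O | - |
| 9 | Page Fault | minor_pagefaults | - | - | O |
| 10 | | major_pagefaults | - | - | O |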
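
Several of the resource items above (maxrss, minor and major page faults) can be read on Linux with `getrusage()`. The sketch below shows one way to sample them around a piece of work. Whether each project obtains its numbers this way is an assumption on our side, and the helper names are hypothetical.

```cpp
#include <sys/resource.h>

#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical snapshot of process-wide resource counters.
struct ResourceSample {
  long max_rss_kb;    // peak resident set size (kilobytes on Linux)
  long minor_faults;  // page faults served without disk I/O
  long major_faults;  // page faults that required disk I/O
};

// Reads the counters for the whole process. Returns -1 in every field if
// getrusage() fails.
ResourceSample sample_resources() {
  rusage usage{};
  if (getrusage(RUSAGE_SELF, &usage) != 0) {
    return ResourceSample{-1, -1, -1};
  }
  return ResourceSample{usage.ru_maxrss, usage.ru_minflt, usage.ru_majflt};
}

// Allocates and writes `bytes` of heap memory, and returns how many minor
// page faults the process took while doing so.
long minor_faults_while_touching(std::size_t bytes) {
  assert(bytes > 0);
  const ResourceSample before = sample_resources();
  std::vector<char> buffer(bytes, 1);  // writing every byte touches every page
  volatile char sink = buffer[bytes - 1];  // keep the buffer observable
  (void)sink;
  const ResourceSample after = sample_resources();
  return after.minor_faults - before.minor_faults;
}
```

Calling `minor_faults_while_touching()` inside a loop that is supposed to be real-time safe is a quick check of whether a page fault guard is effective: with memory locked and prefaulted, the count would be expected to stay at or near zero.
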