FLAGS
To activate logging mode, use "-m <flags>"
Flags:
- m: mapping
- t: timestamp
- W: worker events
- T: task events
- S: stream traces (only with task events)
- M: message traces (only with task events)
- L: load information (total computation time, waiting time, and waiting count of each worker). Note that if flag W is set, the load information can be derived from worker events.
- A: all flags are set
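The flag rules above can be sketched as a small validation helper. This is only an illustration: the function name `expand_flags` is hypothetical, and treating S or M without T as an error is a choice made for this sketch (the real tool may simply ignore them).

```python
# Hypothetical helper: expand and validate a "-m <flags>" string.
ALL_FLAGS = "mtWTSML"  # every individual flag listed above


def expand_flags(flags: str) -> set:
    """Return the set of individual flags selected by a flag string."""
    if "A" in flags:
        # 'A' means all flags are set.
        return set(ALL_FLAGS)
    unknown = set(flags) - set(ALL_FLAGS)
    if unknown:
        raise ValueError(f"unknown flags: {sorted(unknown)}")
    selected = set(flags)
    # S and M are documented as "only with task events".
    # Assumption of this sketch: reject them when T is missing.
    if selected & {"S", "M"} and "T" not in selected:
        raise ValueError("flags S and M require flag T")
    return selected
```

For example, `expand_flags("tWT")` yields the three flags t, W and T, while `expand_flags("A")` yields all seven.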
=======================================
FORMAT
There are two kinds of monitoring files: map files and log files.
All files start with the version of the log format, which makes it easy to keep track of format changes.
Old version: 2.1 (since 10/01/2012)
Current version: 2.2 (since 05/03/2012)
========= DECEN LPEL =========
Map files
---------
Map files record the mapping of all tasks to workers. Each entry describes the mapping of one task. The format of a mapping entry is defined as follows:
<task-id> [ NET-PATH ] <space> <box-name> <space> <worker-id><end-character>
NET-PATH: {:POS} *
POS: S<num> ... position within serial composition (indices start with 1)
P ... border line of parallel composition (split/collect)
P<num> ... position within parallel composition, i.e. index of the branch (indices start with 0)
R ... border line of serial replication (star)
R<num> ... index of the instance generated by serial replication
I ... border line of parallel replication (split/collect)
I<num> ... position in parallel replication, i.e. index of the branch (this is also the tag value)
<task-id> ... identification of the task, used to look up its internal behaviours in log files
<box-name> ... the S-Net box name, or the type for an implicit entity
<worker-id> ... starts with 0; -1 indicates global threads for operating inputs and outputs
<space> ... space character
<end-character> ... '#' character
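A minimal sketch of a map entry parser for the DECEN LPEL format above. It makes several assumptions that the format description does not confirm: the task id is a non-negative decimal number, the NET-PATH follows it directly without a space, box names contain no spaces, and the version header at the start of the file has already been removed. The names `MapEntry`, `parse_map_entry` and `parse_map_entries` are hypothetical.

```python
import re
from typing import List, NamedTuple

# <task-id>[:POS]* <box-name> <worker-id>, without the trailing '#'
_MAP_ENTRY = re.compile(r"(\d+)((?::[SPRI]\d*)*) (\S+) (-?\d+)")


class MapEntry(NamedTuple):
    task_id: int
    net_path: List[str]  # e.g. ["S1", "P", "P0"]
    box_name: str
    worker_id: int       # -1 = global threads for input/output


def parse_map_entry(entry: str) -> MapEntry:
    """Parse one map entry (text between two '#' characters)."""
    m = _MAP_ENTRY.fullmatch(entry.strip())
    if m is None:
        raise ValueError(f"not a map entry: {entry!r}")
    task_id, path, box_name, worker_id = m.groups()
    # ":S1:P:P0" -> ["S1", "P", "P0"]; an empty path gives []
    positions = path.split(":")[1:] if path else []
    return MapEntry(int(task_id), positions, box_name, int(worker_id))


def parse_map_entries(text: str) -> List[MapEntry]:
    """Split on the '#' end character and parse every entry."""
    return [parse_map_entry(e) for e in text.split("#") if e.strip()]
```

With the made-up input `"1:S1:P:P0 boxA 0#2 collector -1#"`, the sketch returns two entries: task 1 at path S1, P, P0 on worker 0, and task 2 with an empty path on worker -1.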
Log files
---------
Each worker has one log file recording scheduling events. Each event is represented by one entry in the log file. Each entry ends with "#" rather than a newline, for the sake of the efficiency of the monitoring framework. To make a log file easier to read, "#" can be replaced by "\n".
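The "#" convention can be handled with two small helpers, sketched here under the assumption that "#" never occurs inside an entry. The function names and the sample entries are hypothetical.

```python
from typing import List


def split_entries(text: str) -> List[str]:
    """Return the individual entries of a log, without the '#' terminator."""
    return [e.strip() for e in text.split("#") if e.strip()]


def to_readable(text: str) -> str:
    """Replace every '#' terminator by a newline for easier reading."""
    return text.replace("#", "\n")
```

For instance, `split_entries("10 S #20 W 5 #30 E #")` gives `["10 S", "20 W 5", "30 E"]`.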
There are two kinds of events: worker events and task events.
- Worker events:
+ Worker started: Entry = <timestamp> <start-character> <end-character>
+ Worker waited: Entry = <timestamp> <wait-character> <waiting time> <end-character>
+ Worker ended: Entry = <timestamp> <ended-character> <end-character>
<start-character> ... 'S' character
<wait-character> ... 'W' character
<ended-character> ... 'E' character (not to be confused with <end-character>, the '#' that terminates every entry)
- Task events:
+ Task blocked: Entry = <timestamp> <blocked-by> <task-id> <space> <execution-time> <space> <stream-trace> <message-trace> <end-character>
+ Task ended: Entry = <timestamp> <zombie> <task-id> <space> <execution-time> <space> <create-time> <stream-trace> <message-trace> <end-character>
<blocked-by> ... 'I' = task is blocked by input, 'O' = task is blocked by output, 'A' = task is blocked by any (in a poll operation)
<execution-time> ... execution time since the last dispatch
<zombie> ... 'Z' character
- Stream trace: information of streams on which the task operates
+ <stream-trace> = {<stream-entry>}*
+ <stream-entry> = <stream-id> <mode> <state> <#items> <flags>
<mode> ... 'r' = task reads from stream; 'w' = task writes to stream
<state> ... 'O' = open, 'C' = closed, 'I' = in use
<#items>: number of messages read from/written to the stream (depending on the mode) during the task dispatch
<flags>: Activity flags. If no flag is set, the pattern is '--- '. The first flag ('? ' if set) indicates that the task is blocked on that stream. The second flag ('! ' if set) indicates that reading/writing (depending on the mode) unblocked the task on the other side of the stream. The third flag ('* ' if set) indicates that items have been read from/written to the stream.
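The three-character activity pattern can be decoded as in the sketch below. It assumes that an unset flag is written as '-' in its position (inferred from the all-unset pattern) and strips surrounding whitespace before decoding. `StreamFlags` and `decode_stream_flags` are hypothetical names.

```python
from typing import NamedTuple


class StreamFlags(NamedTuple):
    blocked_on: bool       # '?': the task is blocked on this stream
    woke_other_side: bool  # '!': the access unblocked the task on the other side
    items_moved: bool      # '*': items were read from / written to the stream


def decode_stream_flags(pattern: str) -> StreamFlags:
    """Decode an activity flag pattern such as '---' or '?-*'."""
    pattern = pattern.strip()
    if len(pattern) != 3:
        raise ValueError(f"expected 3 flag characters, got {pattern!r}")
    decoded = []
    for ch, mark in zip(pattern, "?!*"):
        if ch == mark:
            decoded.append(True)
        elif ch == "-":  # assumption: '-' marks an unset flag
            decoded.append(False)
        else:
            raise ValueError(f"unexpected flag character {ch!r}")
    return StreamFlags(*decoded)
```

Under these assumptions, `decode_stream_flags("?-*")` describes a stream the task is blocked on and on which items were moved, without waking the other side.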
- Message trace: information on the messages that the task consumes/produces
+ <message-trace> = {<message-entry> <semicolon>}*
+ <message-entry> = <timestamp> <io> <message-id><S-character><size>
+ <message-id> = <node-id> <dot-character> <local-id>
<timestamp> ... time at which the message is consumed/produced
<io>: 'I' = message is an input (consumed by the task), 'O' = message is an output (produced by the task)
<node-id> ... id of the distributed node
<local-id> ... message id locally on the distributed node
<dot-character> ... '.' character
<semicolon> ... ';' character
<size> ... size of the message
<S-character> ... 'S' character
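A message trace could be parsed along the following lines. This is a sketch under assumptions: the timestamp is taken to consist of digits and dots only and is kept as a string because its exact format is not specified here, whitespace between the fields is treated as optional, and the ids and the size are assumed to be decimal numbers. `MessageEntry` and `parse_message_trace` are hypothetical names.

```python
import re
from typing import List, NamedTuple

# <timestamp> <io> <node-id>.<local-id>S<size>, without the trailing ';'
_MESSAGE = re.compile(r"([0-9.]+)\s*([IO])\s*(\d+)\.(\d+)S(\d+)")


class MessageEntry(NamedTuple):
    timestamp: str  # kept as text; the timestamp format is an assumption
    io: str         # 'I' = consumed by the task, 'O' = produced by the task
    node_id: int
    local_id: int
    size: int


def parse_message_trace(trace: str) -> List[MessageEntry]:
    """Parse a sequence of ';'-terminated message entries."""
    entries = []
    for raw in trace.split(";"):
        raw = raw.strip()
        if not raw:
            continue
        m = _MESSAGE.fullmatch(raw)
        if m is None:
            raise ValueError(f"not a message entry: {raw!r}")
        ts, io, node_id, local_id, size = m.groups()
        entries.append(MessageEntry(ts, io, int(node_id), int(local_id), int(size)))
    return entries
```

For the made-up trace `"100 I 0.1S8;105 O 0.2S16;"`, the sketch reports one input message 0.1 of size 8 and one output message 0.2 of size 16.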
- Note on Message Size:
+ the size reported in the message trace is only accurate for data created by SNet memory utilities (e.g. C4SNetAlloc)
+ for data that is managed by the user and passed to SNet as a pointer, the message size is simply the size of the pointer
+ in distributed S-Net, message transfer is more complicated. SNet passes only a reference to the data, and the real
data is sent only when it is requested. This improves efficiency, but it invalidates the message size in the log files:
the messages sent via LPEL are just references, while the real data is sent directly by Distributed SNet.
(The communication cost in distributed S-Net can still be obtained, see Section DISTRIBUTED SNET)
- Load Information:
+ two entries for the two worker events, worker started and worker ended, which together provide the total running time of the worker (including computational time and waiting time)
+ WC<waiting count>WT<total waiting time>
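As a sketch of how the load information might be evaluated: the total running time is the difference between the worker-ended and worker-started timestamps, and subtracting the total waiting time leaves the computational time. The code assumes that all values are plain numbers in the same time unit, which the format description does not state. `parse_load` and `computation_time` are hypothetical names.

```python
import re
from typing import Tuple

# WC<waiting count>WT<total waiting time>
_LOAD = re.compile(r"WC(\d+)WT(\d+(?:\.\d+)?)")


def parse_load(entry: str) -> Tuple[int, float]:
    """Return (waiting count, total waiting time) from a load entry."""
    m = _LOAD.search(entry)
    if m is None:
        raise ValueError(f"not a load entry: {entry!r}")
    return int(m.group(1)), float(m.group(2))


def computation_time(start_ts: float, end_ts: float, total_waiting: float) -> float:
    """Total running time minus waiting time (same unit assumed for all)."""
    return (end_ts - start_ts) - total_waiting
```

For example, a worker that started at 1000, ended at 2000 and reported `WC3WT250` waited 3 times for 250 time units in total, leaving 750 time units of computation.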
======================================================================
========= HRC LPEL =========
Tasks are not mapped specifically to any worker.
Log entries are the same as in DECEN LPEL except for the following points:
- Map entry: there is no worker id
<task-id> [ NET-PATH ] <space> <box-name> <space><end-character>
- Worker entry:
+ Worker waited: Entry = <timestamp> <wait-character> <waiting time> <end-character>
Waiting time: the period from when the worker sends a request to the master until a task is assigned to it.
======================================================================
========= DISTRIBUTED SNET =========
- The communication cost among computational nodes can be obtained by using the flag "-logComm"
- For each node, one log file "n[node_id]_comm.log" is created
- The file consists of multiple entries; each entry indicates a message that the current node sent to another node
- The entry format is: <node_id> <size> <semicolon>
+ <node_id>: the receiver node
+ <size>: the size of the message
+ <semicolon>: ';' character
+ meaning: the current node sent 1 message of <size> bytes to <node_id>
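The entries of one communication log can be aggregated per receiver node, as in this sketch. It assumes that <node_id> and <size> are decimal numbers separated by whitespace; `summarize_comm` is a hypothetical name.

```python
from typing import Dict, Tuple


def summarize_comm(text: str) -> Dict[int, Tuple[int, int]]:
    """Map each receiver node to (message count, total bytes sent to it)."""
    summary: Dict[int, Tuple[int, int]] = {}
    for raw in text.split(";"):
        fields = raw.split()
        if not fields:
            continue  # skip the empty piece after the last ';'
        if len(fields) != 2:
            raise ValueError(f"not a comm entry: {raw!r}")
        node_id, size = int(fields[0]), int(fields[1])
        count, total = summary.get(node_id, (0, 0))
        summary[node_id] = (count + 1, total + size)
    return summary
```

With the made-up content `"1 128;2 64;1 32;"`, the current node sent two messages (160 bytes in total) to node 1 and one message (64 bytes) to node 2.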