QtCS2017 Discuss Qt Logging enhancements: Difference between revisions

Latest revision as of 17:22, 30 October 2017

Discussion

Structured logging:
- We want to extract information from logs and detect common messages. For example:

From	To
Got message "Hello"	Got message "%s" Hello
Got message "World"	Got message "%s" World

- We want to allow logging to store ancillary data, possibly in machine-readable format, likely not part of the message itself
  - Like journald API can do
  - How do we allow different formatting depending on backend? Needs to support formatting user types too.
Structured output
- Store to databases, for example
- Select backend with environment variable

Need to review with LTTNG and ETW tracing

ML discussion: http://lists.qt-project.org/pipermail/development/2017-October/031047.html

Summary

Qt Logging lacks regarding 2 different use cases:

1/ High-performance code (say HW handling, big data processing) needs a way to inject a few structured tracepoints to generate large binary traces (say, for profiling)

2/ Existing tracepoints (think about nb of qDebug in Qt user code) should generate more structured traces to:

facilitate exploration and diagnostic, and
allow immediate use of analysis tools like Python

Service and Device developers need #1. Medical Devices would benefit from #2 for post-market surveillance. Device designers operating in an uncontrolled environment would probably benefit from #2 as well

Prototypes

For #1: LTTNG and ETW tracing use external description of Statically-Defined Tracepoints (SDT) to generate code for LTTNG and ETW backends (or no code at all)
For #2: Qt-based modmedLog library (presentation, forge) uses Bind<T> and IData implementations to bind tracepoint data to JSON/XML/Console outputs and possibly read them back for replay

Conclusion

We should sit down and see how these 2 use cases intersect and the best things that are provided by each prototype

Requirements

For #1:

as little generated code as possible at the tracepoint:
- no binary or numeric formatting
- completely removable from release code (not just bypassing like QNoDebug::operator<<)

For #2:

source compatibility with qDebug tracepoints
- do not require non trivial changes such as changing from qDebug()<< to qDebug(...)
- allow changing qDebug() to qTrace() or whatever to provide choice of old vs structured tracing
- extensible to any user data type
offer structured trace formats (TSV+JSON, XML, sqlite?) to avoid low-level parsing

Questions

Do we still obey QT_MESSAGE_PATTERN? YES and it could be extended with, say, %{format} %{arg0} ...
Can we generate efficient tracepoint code without an external tracepoint description and tool?
Should we provide a format+args handler delegating by default to the existing handler?

(*QtMessageHandler2)(...,const QString & format, const QStringList & args);

Should we provide a formatting-independant handler based on Bind<T>:

(*QtMessageHandler3)(...,const QString & format, const QList<BindableRef> & args);

What about modmedLog performance?
When variadic macros and templates are available, can they be used to get the best of both prototypes?

Tracing/Logging survey

Tracepoint kinds

Dynamically-Defined

Data layout built at run-time (usually text format but binary is possible too) :

qDebug()<<... (QString msg assembled by QDebug)
LTTng tracef(format,...) (null-terminated string formatted by printf)
sd_journal_send(format,...)
MacOS os_log(level,format,....)
ETW TraceEvent(...,buffer) (buffer containing assembled data and metadata)

PROS: allows any kind of event data (nullable, dynamically-sized structures and arrays, variants, nested data)

CONS: prohibits bare metal performance (runtime function calls for assembling event data, data structure must be explicitely repeated for all event occurences)

USAGE: maximum acceptable effort for most app developer traces (see rsyslog author's remark on developers still not identifying their events 15+years after MS event log offered the possibility: http://blog.gerhards.net/2011/11/what-i-dont-like-about-journald.html )

Statically-Defined

(data layout fixed at compile-time)

External definition
- ETW manifest for uniquely identified events
- Proposed Qt tool https://codereview.qt-project.org/#/c/185287/
Macro-based assembling
- MS TraceLoggingWrite() and related macros
- LTTng tracepoint()

PROS: allows bare metal performance (no heap allocation, fixed instruction pipeline, data structure removed from event occurences)

CONS:

prohibits or complicates some dynamic event data (nullable, dynamically-sized structures and arrays, variants, nested data)
app developers will generally not take the burden to generate an UUID, add a prebuild step to their build and describe trace points data when they can just append it to a generic tracepoint (see above)

USAGE: system troubleshooting for sysadmin

Probes

(no-ops that can be dynamically changed to collect data with DTrace or similar tools)

DTrace (on MacOS for instance)
Linux bcc/eBPF

PROS: allows many live analysis with limited performance impact (no impact when unused)

CONS: probe safety is a concern (solved by DTrace in general but still undesirable for certified software IMO)

USAGE: performance analysis graal for experts only (DTrace author admits most DTrace users just run existing D scripts)

Tracepoint data formats

Binary

LTTng CTF (public, though it remains unclear how to pass dynamically-encoded CTF to LTTng library)
MS EtwSelfDescribingFormat (private, though TraceLogging macros may be used to understand it)
CBOR (https://tools.ietf.org/html/rfc7049, https://www.iana.org/assignments/cbor-tags/cbor-tags.xhtml)

PROS: write performance mainly depends on memory copies, sometime on memory allocations and function calls

CONS: difficult choice between interoperability and native toolsets

Text

printf(format,...)
qDebug(format,...)
qDebug()<<...
LTTng tracef(format,...)

CONS: write performance impaired by encoding of numbers, QString, QByteArray

PROS: interoperability and evolutions facilitated

Configurable

modmedLog LogOutput + IData
log4xx
...

Trace backends

File

PROS: interoperability, simple toolset

CONS: unknown throughput, lacks basic logging features (log rotation)

MS ETW / LTTng

PROS: best throughput, async IO, unlogged events accessible from crash dump

CONS: specialised and platform-dependent toolset unknown from most app developers

MS EventViewer / Journald / syslog

PROS: good throughput

CONS: platform-dependent toolset

Configurable

QtMessageHandler
modmedLog LogOutput
log4xx
...

@@ Line 21: / Line 21: @@
 * Structured output
 ** Store to databases, for example
-** Do we still obey QT_MESSAGE_PATTERN?
 ** Select backend with environment variable
 Need to review with [https://codereview.qt-project.org/185287 LTTNG and ETW tracing]
-== Welcome text ==
+ML discussion: http://lists.qt-project.org/pipermail/development/2017-October/031047.html
-Qt-based devices such as Medical Cyber-Physical Systems could be developed, verified and validated more efficiently using structured logs. We suggest Qt Logging enhancements that would facilitate both human exploration and automated analysis with Python or other tools. The session will allow:
+== Summary ==
-* gathering your feedback
-* examining which enhancements would be useful to Qt
-and how to contribute them...
-==== Use Cases ====
+Qt Logging lacks regarding 2 different use cases:
-help development
-help fixing bugs
+/ High-performance code (say HW handling, big data processing) needs a way to inject a few structured tracepoints to generate large binary traces (say, for profiling)
-...
+/ Existing tracepoints (think about nb of qDebug in Qt user code) should generate more structured traces to:
+* facilitate exploration and diagnostic, and
+* allow immediate use of analysis tools like Python
-==== Enhancements? ====
+Service and Device developers need #1. Medical Devices would benefit from #2 for post-market surveillance. Device designers operating in an uncontrolled environment would probably benefit from #2 as well
-extended handler with format
-less literals in tracepoints
+=== Prototypes ===
+* For #1: [https://codereview.qt-project.org/185287 LTTNG and ETW tracing] use external description of Statically-Defined Tracepoints (SDT) to generate code for LTTNG and ETW backends (or no code at all)
+* For #2: Qt-based modmedLog library ([https://drive.google.com/file/d/0B_ZGW8rLIfE3aHVxR3loMThTbzA/view?usp=sharing presentation], [https://forge.imag.fr/frs/?group_id=982 forge]) uses Bind<T> and IData implementations to bind tracepoint data to JSON/XML/Console outputs and possibly read them back for replay
-TSV+JSON vs text-only
+=== Conclusion ===
+We should sit down and see how these 2 use cases intersect and the best things that are provided by each prototype
-Bind<TUserData>
+== Requirements ==
+For #1:
+* as little generated code as possible at the tracepoint:
+** no binary or numeric formatting
+** completely removable from release code (not just bypassing like QNoDebug::operator<<)
+For #2:
+* source compatibility with qDebug tracepoints
+** do not require non trivial changes such as changing from qDebug()<< to qDebug(...)
+** allow changing qDebug() to qTrace() or whatever to provide choice of old vs structured tracing
+** extensible to any user data type
+* offer structured trace formats (TSV+JSON, XML, sqlite?) to avoid low-level parsing
-format-free handler
+== Questions ==
+* Do we still obey QT_MESSAGE_PATTERN? YES and it could be extended with, say, %{format} %{arg0} ...
+* Can we generate efficient tracepoint code without an external tracepoint description and tool?
+* Should we provide a format+args handler delegating by default to the existing handler?
+(*QtMessageHandler2)(...,const QString & format, const QStringList & args);
+* Should we provide a formatting-independant handler based on Bind<T>:
+(*QtMessageHandler3)(...,const QString & format, const QList<BindableRef> & args);
+* What about modmedLog performance?
+* When variadic macros and templates are available, can they be used to get the best of both prototypes?
-performance requirements
+== Tracing/Logging survey ==
-smart trace points (id, count, sampling)
+=== Tracepoint kinds ===
+==== Dynamically-Defined ====
+Data layout built at run-time (usually text format but binary is possible too) :
+* qDebug()<<... (QString msg assembled by QDebug)
+* LTTng tracef(format,...) (null-terminated string formatted by printf)
+* sd_journal_send(format,...)
+* MacOS os_log(level,format,....)
+* ETW TraceEvent(...,buffer) (buffer containing assembled data and metadata)
+PROS: allows any kind of event data (nullable, dynamically-sized structures and arrays, variants, nested data)
+CONS: prohibits bare metal performance (runtime function calls for assembling event data, data structure must be explicitely repeated for all event occurences)
+USAGE: maximum acceptable effort for most app developer traces
+(see rsyslog author's remark on developers still not identifying their events 15+years after MS event log offered the possibility: http://blog.gerhards.net/2011/11/what-i-dont-like-about-journald.html )
+==== Statically-Defined ====
+(data layout fixed at compile-time)
+* External definition
+** ETW manifest for uniquely identified events
+** Proposed Qt tool https://codereview.qt-project.org/#/c/185287/
+* Macro-based assembling
+** MS TraceLoggingWrite() and related macros
+** LTTng tracepoint()
+PROS: allows bare metal performance (no heap allocation, fixed instruction pipeline, data structure removed from event occurences)
+CONS:
+* prohibits or complicates some dynamic event data (nullable, dynamically-sized structures and arrays, variants, nested data)
+* app developers will generally not take the burden to generate an UUID, add a prebuild step to their build and describe trace points data when they can just append it to a generic tracepoint (see above)
+USAGE: system troubleshooting for sysadmin
+==== Probes ====
+(no-ops that can be dynamically changed to collect data with DTrace or similar tools)
+* DTrace (on MacOS for instance)
+* Linux bcc/eBPF
+PROS: allows many live analysis with limited performance impact (no impact when unused)
+CONS: probe safety is a concern (solved by DTrace in general but still undesirable for certified software IMO)
+USAGE: performance analysis graal for experts only (DTrace author admits most DTrace users just run existing D scripts)
+=== Tracepoint data formats ===
+==== Binary ====
+* LTTng CTF (public, though it remains unclear how to pass dynamically-encoded CTF to LTTng library)
+* MS EtwSelfDescribingFormat (private, though TraceLogging macros may be used to understand it)
+* CBOR (https://tools.ietf.org/html/rfc7049, https://www.iana.org/assignments/cbor-tags/cbor-tags.xhtml)
+PROS: write performance mainly depends on memory copies, sometime on memory allocations and function calls
+CONS: difficult choice between interoperability and native toolsets
+==== Text ====
+* printf(format,...)
+* qDebug(format,...)
+* qDebug()<<...
+* LTTng tracef(format,...)
+CONS: write performance impaired by encoding of numbers, QString, QByteArray
+PROS: interoperability and evolutions facilitated
+==== Configurable ====
+* modmedLog LogOutput + IData
+* log4xx
+* ...
+=== Trace backends ===
+==== File ====
+PROS: interoperability, simple toolset
+CONS: unknown throughput, lacks basic logging features (log rotation)
+==== MS ETW / LTTng ====
+PROS: best throughput, async IO, unlogged events accessible from crash dump
+CONS: specialised and platform-dependent toolset unknown from most app developers
+==== MS EventViewer / Journald / syslog ====
+PROS: good throughput
+CONS: platform-dependent toolset
+==== Configurable ====
+* QtMessageHandler
+* modmedLog LogOutput
+* log4xx
+* ...