-
Notifications
You must be signed in to change notification settings - Fork 6k
[Diangostics][dotnet-trace] Add collect-linux verb #47894
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[Diangostics][dotnet-trace] Add collect-linux verb #47894
Conversation
bc293c0
to
e1a3ec7
Compare
@@ -43,7 +43,7 @@ The `dotnet-trace` tool: | |||
* Is a cross-platform .NET Core tool. | |||
* Enables the collection of .NET Core traces of a running process without a native profiler. | |||
* Is built on [`EventPipe`](./eventpipe.md) of the .NET Core runtime. | |||
* Delivers the same experience on Windows, Linux, or macOS. | |||
* On Linux, provides additional integration with kernel user_events for native tracing tool compatibility. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might want to tweak the wording a bit later, but not focusing on this for the moment :)
### Synopsis | ||
|
||
```dotnetcli | ||
dotnet-trace collect-linux |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can prune away some of these options
- buffersize - If OneCollect can reasonably pick the size then we may not need an option to explicitly set it.
- diagnostics-port - I think we'd only need this in some advanced scenarios. Given that we could be profiling multiple processes maybe we'd even need multiple ports? I'd suggest lets leave it out for now and wait to see what happens.
- resume-runtime - if we don't have diagnostics port then we shouldn't need this.
- event-filters - I'd suggest we leave this out for now to keep things simpler.
- tracepoint-configs - can we leave this out? If the user cares about the tracepoint names that imples they are going to use some other tool besides dotnet-trace to record the events. But if that is true I'm not sure why they'd want dotnet-trace to be creating a nettrace file as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would there be a scenario where users are capturing events with another tool, but they want to start capturing runtime events from their .NET app as well, and just use dotnet-trace as the medium to start a user_events EventPipeSession?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can conceive of that scenario, but I'm suggesting we don't do anything to proactively support it right now to keep the scope smaller.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pruned and grouped the options into functionality (the previous order already deviates from dotnet-trace collect --help
, and unless there's a reason to keep the options in some sort of alphabetical order, I think the functional order is easier to understand)
[--diagnostic-port] | ||
[-o|--output <trace-file-path>] | ||
[-p|--process-id <pid>] | ||
[--profile <profile-name>] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll need to decide what profiles are available and which events they each collect. I expect this is mostly the same as for the 'collect' verb, but cpu-sampling probably should be different. We also might want a thread-time profile that collects context switches.
If we do update these profiles, we should strongly consider also renaming the highly misleading "cpu-sampling" profile for the collect verb. Currently that profile collects thread-time information, not CPU samples.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I lightly updated the profiles for both collect and collect-linux. Still up for refinement once I have a better understanding of how to mix/match the .NET events and linux perf events. I'm wondering if we should allow --profiles
if we want to offer several orthogonal profiles and allow customers to mix/match. We already have some logic to mix a profile with the providers/keywords as well.
# All enabled events from MyCustomProvider will be written to MyCustomProvider_custom_events | ||
``` | ||
|
||
- **`--kernel-events <list-of-kernel-events>` (optional)** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should work with Beau to figure out what kind of event names can be supported. For example https://man7.org/linux/man-pages/man1/perf-record.1.html shows a whole bunch of different things can be specified as an event and I have no idea what part of that OneCollect handles.
I'm hoping we can at least support anything in /sys/kernel/tracing/available_events as well as the symbolic PMU events like cpu-cycles.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like all events under /sys/kernel/tracing/available_events can be enabled, so I changed the wording to suggest that all of those events found there or found categorically under /sys/kernel/tracing/events/ can be enabled through this option --perf-event-tracepoints
.
| `net` | Network-related events | `net:netif_rx`, `net:net_dev_xmit` | | ||
| `fs` | Filesystem I/O events | `ext4:*`, `vfs:*` | | ||
| `mm` | Memory management events | `kmem:*`, `vmscan:*` | | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If someone specified "sched:sched_wakeup,sched:sched_switch" what provider name, event name, and field lists would we expect to show up in the nettrace file? (This level of detail may not go into the docs, but we should understand it ourselves to decide what info should go in the docs)
- **perf**: Use `perf list user_events*` to see available events | ||
- **System monitoring tools**: Any tool that can consume Linux tracepoints | ||
|
||
### Examples |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lower in this doc are some examples of using dotnet-trace collect in various situations. No need just yet, but we'd probably want to update or add to those examples for the collect-linux functionality once we've clarified what it will be.
This reverts commit e1a3ec7.
PoliCheck Scan ReportThe following report lists PoliCheck issues in PR files. Before you merge the PR, you must fix all severity-1 issues. Other issues are also a high priority. The AI Review Details column lists suggestions for either removing or replacing the terms. If you find a false positive result, mention it in a PR comment and include this text: #policheck-false-positive. This feedback helps reduce false positives in future scans. ✅ No issues foundMore information about PoliCheckInformation: PoliCheck | Severity Guidance | Term |
Summary
Add new
collect-linux
verb to utilize new user_events based event tracing.Internal previews