Skip to content

feat: Add Pyroscope tools #170

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Jun 14, 2025
Merged

feat: Add Pyroscope tools #170

merged 15 commits into from
Jun 14, 2025

Conversation

bryanhuhta
Copy link
Contributor

@bryanhuhta bryanhuhta commented Jun 11, 2025

Adds basic tools to interact with Pyroscope data:

  • ListPyroscopeLabelNames - used to find which label names are available
  • ListPyroscopeLabelValues - used to find which label values are available for a label name
  • ListPyroscopeProfileTypes - used to discover which profile types are available
  • FetchPyroscopeProfile - the meat and potatoes, can fetch a profile in DOT format for analysis (akin to Flame Graph AI in Profiles Drilldown)

Demos:

demo.mov
demo.mp4

To do:

  • Fix the "too many tokens" limitation (perhaps don't use DOT format profiles?)
  • Add tests

cc @Rperry2174 @petethepig

@bryanhuhta bryanhuhta self-assigned this Jun 11, 2025
@bryanhuhta bryanhuhta marked this pull request as ready for review June 11, 2025 22:27
@bryanhuhta bryanhuhta requested a review from a team as a code owner June 11, 2025 22:27
@bryanhuhta
Copy link
Contributor Author

I tried to copy the e2e tests that Loki uses, but it was challenging predicting the order in which the LLM would call the tools. When testing fetch_pyroscope_profile, sometimes it would call list_pyroscope_label_names first, others list_pyroscope_label_values.

Copy link
Collaborator

@sd2k sd2k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Bryan, this is looking great. A couple of things:

  • yeah the e2e tests can be a bit fiddly, we can add those later on no problem. Would you be able to add some integration tests instead? See tools/loki_test.go for an example.
  • could you update the README to include the new tools in the table & list of features?

Otherwise this LGTM!

@ioanarm
Copy link
Contributor

ioanarm commented Jun 12, 2025

Thanks for the PR @bryanhuhta!
On the e2e tests when I had a similar issue with tool calling, I tried to iterate a bit on the tool descriptions but also on the testing prompts. If you explicitly ask to call one tool, or expecting to only do one tool call, then it shouldn't call more (this is just an example - I am not suggesting to ask to only call one tool 😋 ). If you want, push a commit with your e2e tests and I can try and help as well!

If you'd like, feel free to push a commit with your e2e tests and I'd be happy to help troubleshoot. Alternatively, we could tackle the e2e testing in a separate PR since the integration tests should be enough for this one.

Copy link

@edwardcqian edwardcqian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found a typo, otherwise I think this look great as a first pass.

I do think fetch_pyroscope_profile probably need a bit more examples in the description before it can be super useful, at least in my limited testing a lot of prompting is need to get an actual profile result

}

var ListPyroscopeProfileTypes = mcpgrafana.MustTool(
"list_pyroscope_label_values",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name is incorrect 😅

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This prompted me to open an issue with mcp-go mark3labs/mcp-go#390

right now it silently overrides the tools if two share the same name

@bryanhuhta
Copy link
Contributor Author

I do think fetch_pyroscope_profile probably need a bit more examples in the description before it can be super useful, at least in my limited testing a lot of prompting is need to get an actual profile result

@edwardcqian We have a pretty effective prompt that powers our Flame Graph AI feature in Profiles Drilldown which I considered reusing (at least in spirit, if not exactly).

https://github.com/grafana/profiles-drilldown/blob/e6abda18c6c8943f0e1a0c9165c90769be5455fd/src/pages/ProfilesExplorerView/components/SceneAiPanel/domain/buildLlmPrompts.ts#L25-L43

However, I ultimately decided to try keep it more open ended to allow users to guide themselves along the exploration of the profile, rather than being forced down a prescribed path. I'm happy to tweak the prompt to improve the experience, but I would like to do it in a way that leaves most of the prompting up to the user.

What are some problems you faced when you tested it out? Was it difficult to get the LLM to actually fetch a profile? Or did you have challenges getting it to perform an analysis on the profile?

@bryanhuhta
Copy link
Contributor Author

@sd2k @ioanarm w.r.t. e2e tests, I added integration tests (modeled off the Loki ones) and plan to add e2e tests in a follow up PR

@edwardcqian
Copy link

edwardcqian commented Jun 12, 2025

I had some trouble getting it to query profiles.

It's not quite doing the data source discovery -> label search + value -> profile type -> query flow without being asked.

Maybe LLMs are a little more aware of metrics querying than profiles? 🤔

@bryanhuhta
Copy link
Contributor Author

bryanhuhta commented Jun 12, 2025

Can you share some sample prompts? I had a similar experience if I used prompts like

Analyze the CPU usage of service X

Usually the LLM would default to analyzing Prometheus CPU metrics (not usually helpful). But if I was a bit more prescriptive like

Can you help me optimize CPU usage of service X?

I could reliably get various models to fetch a CPU profile and analyze it.

@edwardcqian
Copy link

edwardcqian commented Jun 12, 2025

I'm testing with ops, so it's a little bit more challenging than a test environment.

I'm asking "Analyze memory usage in the "service" service in "cluster" cluster, "namespace" namespace (please use pyroscope if possible)." (forgot this repo is public, removing internal specific terms 😅)

What helped is specifically call out the other tools in fetch_pyroscope_profiles's description:

 Matchers are not required, but highly recommended, they are generally used to select an application by the service_name label (e.g. `service_name=\"foo\"`). Use the list_pyroscope_label_names tool to fetch available label names, and the list_pyroscope_label_values tool to fetch available label values.

With this it ended up actually getting something.

@bryanhuhta bryanhuhta requested review from edwardcqian and sd2k June 13, 2025 21:37
@bryanhuhta
Copy link
Contributor Author

@edwardcqian Can you take another peek? In 7e56cb5 I updated the prompt to include your changes.

Copy link

@edwardcqian edwardcqian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm, we can add e2e test in another PR

@sd2k sd2k merged commit 60a7390 into main Jun 14, 2025
14 of 19 checks passed
@sd2k sd2k deleted the pyroscope branch June 14, 2025 10:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants