Skip to content

Conversation

@LujieDuan
Copy link
Contributor

@LujieDuan LujieDuan commented Sep 8, 2022

Description

This change introduces a new package in confgenerator called resourcedetector to detect metadata of the environment Ops Agent is running in. The detected metadata can then be used by receivers/processors. Right now we only have a detector for GCE environment; the detector is not currently used by any components; this will be used by the Prometheus receivers.

Related issue

b/244434005

How has this been tested?

This has been tested with unit tests and manual tests on GCE.
Integration tests for this is TBD.

Checklist:

  • Unit tests
    • Unit tests do not apply.
    • Unit tests have been added/modified and passed for this PR.
  • Integration tests
    • Integration tests do not apply.
    • Integration tests have been added/modified and passed for this PR.
  • Documentation
    • This PR introduces no user visible changes.
    • This PR introduces user visible changes and the corresponding documentation change has been made.
  • Minor version bump
    • This PR introduces no new features.
    • This PR introduces new features, and there is a separate PR to bump the minor version since the last release already.
    • This PR bumps the version.

Copy link
Contributor

@ridwanmsharif ridwanmsharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks pretty solid. There are a few transformations and nuances here around differences from how the gce detector in prometheus collects things (Compute engine API) and how we do (metadata server). We should add an integration test in addition to the unit tests in this PR to make sure the metadata attributes look like what we expect for the prometheus use case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest keeping this detector generic and leave all the translation to the prometheus style labels to the prometheus receiver. This detector is useful for other components too. One such example is: b/219991181 | P3 | Allow LogName to be used in processors

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

labels feels like a prometheus specific thing to me. It can get confusing especially since the metadata has its own labels too. Naming it instead to something like metadata or attributes might make it clearer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a single label or nested? I thought it was a nested one but could be wrong.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add some comments here around what is available in the detector now and what is not

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The contract we expose should probably be more clear - instead of having map[string]string it could be a struct that has fields for all the attributes. This way it is part of the public interface what fields are available as part of the detector, instead of maintaining another set of key labels.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are very interesting mutations we have to do. I'd leave it out of the detector package and add it when the prometheus receiver uses it. Only it cares about the Prometheus SD's conventions.

@LujieDuan LujieDuan force-pushed the lujieduan-prometheus-metadata branch from 5482025 to 3a79286 Compare September 12, 2022 15:24
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if it would be better to call this UnrecognizedPlatformDetector or something similar.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for UnrecognizedPlatformDetector.

@LujieDuan LujieDuan marked this pull request as ready for review September 12, 2022 16:53
Copy link
Contributor

@ridwanmsharif ridwanmsharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking very good! Left a couple comments

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for UnrecognizedPlatformDetector.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This struct and interface seems pretty confusing to me. It feels simpler to have the provider do both the fetching of the keys and then aggregate the values rather than having the detector work with this interface.

In other words, if the provider provided getMetadata, getLabels and getLabels methods that each returned a map[string]string, the GCE detector wouldn't need to do any of this nesting and unnesting logic right? That would be contained in the implementation of the getMetadata, getLabels and getLabels methods. Let me know what you think of this approach.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like it belongs in the integration_test folder right?

Also you will be able to test a lot more of the fields there since we have the VM information when the integration test sets it up:

type VM struct {
Name string
Project string
Network string
Platform string
Zone string
MachineType string
ID int64
// The IP address to ssh/WinRM to. This is the external IP address, unless
// USE_INTERNAL_IP is set to 'true'. See comment on extractIPAddress() for
// rationale.
IPAddress string
// WindowsCredentials is only populated for Windows VMs.
WindowsCredentials *WindowsCredentials
AlreadyDeleted bool
}

See this as an example of what is used when setting up the environment for an integration test:

ctx, logger, vm := agents.CommonSetup(t, platform)

Copy link
Contributor

@ridwanmsharif ridwanmsharif Sep 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know what the best way is to integration test this. Maybe the easiest way is to have a small go command line runner that hits your interface and dumps the metadata into a json file. The integration test can then read that from the VM and compare it to what it expects. Just a thought, figured its worth sharing. Looking forward to whatever you end up going with.

@LujieDuan LujieDuan force-pushed the lujieduan-prometheus-metadata branch from 1696c5b to 939a738 Compare September 15, 2022 19:21
@LujieDuan LujieDuan added the kokoro:force-run Forces kokoro to run integration tests on a CL label Sep 15, 2022
@stackdriver-instrumentation-release stackdriver-instrumentation-release removed the kokoro:force-run Forces kokoro to run integration tests on a CL label Sep 15, 2022
@LujieDuan LujieDuan changed the title WIP: Fetch meta labels for Prometheus from GCE VM Metadata Server Fetch meta labels for Prometheus from GCE VM Metadata Server Sep 15, 2022
@LujieDuan LujieDuan requested review from a team and ridwanmsharif September 15, 2022 21:24
Copy link
Contributor

@ridwanmsharif ridwanmsharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I'd get another reviewer however since we were pretty involved in this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] naming feels a bit off to me. GCEDetector.Project sounds like the project of the detector and not the metadata of the VM that is being detected. I'm struggling to come up with a better name/interface that works for both GCE and non-GCE resource detection however. Maybe we can just do the simple thing and rename this and Detector to Resource and the method GetDetector to GetResource. And we can add a comment that in GCE VM land the resource is the VM metadata.

It is a nit pick tho. Feel free to ignore.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good idea - I have changed the name to Resource.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also can we add some comments documenting how to use the interface. Its hard to tell from the interface, how it is meant to be used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added comments to what the interface means and how to cast to get an instance of the underlying type such as GCEResource.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also add what it needs to be casted to (for example GCEDetector)

@LujieDuan LujieDuan force-pushed the lujieduan-prometheus-metadata branch from bd09687 to b179df4 Compare September 16, 2022 13:59
Copy link
Contributor

@ridwanmsharif ridwanmsharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wooo LGTM!

@LujieDuan LujieDuan added the kokoro:force-run Forces kokoro to run integration tests on a CL label Sep 16, 2022
@stackdriver-instrumentation-release stackdriver-instrumentation-release removed the kokoro:force-run Forces kokoro to run integration tests on a CL label Sep 16, 2022
@LujieDuan LujieDuan changed the title Fetch meta labels for Prometheus from GCE VM Metadata Server Fetch metadata from GCE VM Metadata Server Sep 16, 2022
@LujieDuan LujieDuan requested a review from braydonk September 16, 2022 15:31
@LujieDuan LujieDuan force-pushed the lujieduan-prometheus-metadata branch from b179df4 to c38b059 Compare September 19, 2022 14:01
@LujieDuan LujieDuan merged commit 6272ea2 into master Sep 19, 2022
@igorpeshansky igorpeshansky deleted the lujieduan-prometheus-metadata branch July 10, 2023 21:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants