@ridwanmsharif
Contributor

Description

This change adds the Prometheus metrics receiver to the Ops Agent, hidden behind the environment variable UNSUPPORTED_BETA_PROMETHEUS_RECEIVER=enabled. If that environment variable is set (through the systemd unit file on Linux, or visible to the process on Windows), the Prometheus receiver can be configured. The integration tests and unit tests still run for the feature.

Related issue

b/250885264 | P2 | Merge private preview branch into master behind a feature gate

How has this been tested?

Checklist:

  • Unit tests
    • Unit tests do not apply.
    • Unit tests have been added/modified and passed for this PR.
  • Integration tests
    • Integration tests do not apply.
    • Integration tests have been added/modified and passed for this PR.
  • Documentation
    • This PR introduces no user visible changes.
    • This PR introduces user visible changes and the corresponding documentation change has been made.
  • Minor version bump
    • This PR introduces no new features.
    • This PR introduces new features, and there is a separate PR to bump the minor version since the last release already.
    • This PR bumps the version.

@ridwanmsharif ridwanmsharif force-pushed the ridwanmsharif/merge-prometheus branch 2 times, most recently from a912717 to 080bb26 Compare October 17, 2022 19:24
@ridwanmsharif ridwanmsharif force-pushed the ridwanmsharif/merge-prometheus branch from c32094e to 1a93bd3 Compare October 17, 2022 22:06
@ridwanmsharif ridwanmsharif marked this pull request as ready for review October 17, 2022 22:06
@ridwanmsharif
Contributor Author

@qingling128 The last 2 commits add changes from the preview branch. One adds the environment variable feature gate. The other adds some licenses and lint fixes.

@ridwanmsharif ridwanmsharif force-pushed the ridwanmsharif/merge-prometheus branch from b1b1c9f to 7050fef Compare October 18, 2022 18:33
@ridwanmsharif ridwanmsharif force-pushed the ridwanmsharif/merge-prometheus branch from 7050fef to 7d6d1b2 Compare November 7, 2022 16:44
@ridwanmsharif ridwanmsharif force-pushed the ridwanmsharif/merge-prometheus branch from 7d6d1b2 to 2309a20 Compare November 7, 2022 20:56
Contributor

@LujieDuan LujieDuan left a comment

LGTM modulo a couple of small things:

Small thing: we can replace the $$ with $ here

Same for $$

Same here with $$

ridwanmsharif and others added 8 commits November 8, 2022 19:25
* prometheus: add config generation for the prometheus receiver

This change does the following:
- [  ] Pulls in the googlemanagedprometheus exporter for prometheus
- [  ] Pulls in prometheus so we can use the exact same config structure
- [  ] Adds the config generation for prometheus receivers
- [  ] Refactors some of the pipeline logic so prometheus receivers have
  their own exporter
- [  ] Adds config validation for prometheus receivers
- [  ] Adds basic unit tests for the prometheus receiver

* prometheus: add more unit tests

Signed-off-by: Ridwan Sharif <ridwanmsharif@google.com>

* prometheus: register all service discovery implementations so yaml parsing doesn't fail

Signed-off-by: Ridwan Sharif <ridwanmsharif@google.com>

* prometheus: report error in platform-agnostic way

Signed-off-by: Ridwan Sharif <ridwanmsharif@google.com>

Signed-off-by: Ridwan Sharif <ridwanmsharif@google.com>
* prometheus: add metadata labels to every static config

This change adds the following:
- [] Hooks up the receiver to the metadata detector
- [] Adds labels to every static config
- [] Adds unit tests
- [] Adds integration tests
- [] Disallows updating namespace, location and cluster labels
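
As a rough illustration of the last two behaviours in the list above (labels added to every static config, and reserved labels rejected), here is a minimal Go sketch. It is not the PR's implementation: the `staticConfig` type, both function names, and the rule that user labels win over metadata labels on a name collision are assumptions. Only the reserved label names come from the commit message.

```go
package main

import "fmt"

// Reserved label names, as listed in the commit message.
var reservedLabels = []string{"namespace", "location", "cluster"}

// staticConfig is a hypothetical stand-in for one Prometheus static config.
type staticConfig struct {
	Targets []string
	Labels  map[string]string
}

// validateLabels rejects a static config that tries to set a reserved label.
func validateLabels(sc staticConfig) error {
	for _, name := range reservedLabels {
		if _, ok := sc.Labels[name]; ok {
			return fmt.Errorf("label %q is reserved and cannot be set", name)
		}
	}
	return nil
}

// addMetadataLabels returns a copy of sc with the metadata labels merged in.
// Assumption: a user-provided label wins when both maps share a key.
func addMetadataLabels(sc staticConfig, metadata map[string]string) (staticConfig, error) {
	if err := validateLabels(sc); err != nil {
		return staticConfig{}, err
	}
	merged := make(map[string]string, len(sc.Labels)+len(metadata))
	for k, v := range metadata {
		merged[k] = v
	}
	for k, v := range sc.Labels {
		merged[k] = v
	}
	return staticConfig{Targets: sc.Targets, Labels: merged}, nil
}

func main() {
	sc := staticConfig{
		Targets: []string{"localhost:9100"},
		Labels:  map[string]string{"team": "infra"},
	}
	// The metadata key and value here are placeholders for illustration.
	out, err := addMetadataLabels(sc, map[string]string{"instance_id": "1234"})
	fmt.Println(out.Labels, err)

	bad := staticConfig{Labels: map[string]string{"cluster": "mine"}}
	_, err = addMetadataLabels(bad, nil)
	fmt.Println(err)
}
```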

* integration_test: ignore instance_id label for prom metrics

* prometheus: add groupbyattrs processor so namespace, location and cluster fields can be used

Signed-off-by: Ridwan Sharif <ridwanmsharif@google.com>

Signed-off-by: Ridwan Sharif <ridwanmsharif@google.com>
* prometheus: use prometheus styled regex instead of OTel

This is mainly focussed on the `replacement` field and us not using
the otel styled `$` syntax for the user visible prom config.

* prometheus: deep copy using marshal and unmarshal before updating regex

Signed-off-by: Ridwan Sharif <ridwanmsharif@google.com>

* prometheus: simplify deepcopy and escaping of $ in replacement strings
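
The three commits above can be sketched roughly as follows: the user writes a Prometheus-style `replacement` such as `$1`, and the generated collector config gets `$$1`, because the OpenTelemetry Collector treats a bare `$` as the start of an environment variable reference and `$$` as a literal `$`. This is a sketch under assumptions, not the PR's code: the struct types are hypothetical stand-ins for the Prometheus config types, and `encoding/json` is used for the marshal/unmarshal round trip only to keep the example standard-library-only.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// relabelConfig and scrapeConfig are hypothetical, simplified stand-ins.
type relabelConfig struct {
	Regex       string `json:"regex"`
	Replacement string `json:"replacement"`
}

type scrapeConfig struct {
	JobName        string          `json:"job_name"`
	RelabelConfigs []relabelConfig `json:"relabel_configs"`
}

// deepCopy clones the config by marshalling and unmarshalling it, so that
// later edits do not touch the slices owned by the user's config.
func deepCopy(in scrapeConfig) (scrapeConfig, error) {
	var out scrapeConfig
	b, err := json.Marshal(in)
	if err != nil {
		return out, err
	}
	err = json.Unmarshal(b, &out)
	return out, err
}

// escapeReplacements returns a copy of in where every "$" in a replacement
// string is doubled, leaving the input unchanged.
func escapeReplacements(in scrapeConfig) (scrapeConfig, error) {
	out, err := deepCopy(in)
	if err != nil {
		return out, err
	}
	for i := range out.RelabelConfigs {
		r := &out.RelabelConfigs[i]
		r.Replacement = strings.ReplaceAll(r.Replacement, "$", "$$")
	}
	return out, nil
}

func main() {
	user := scrapeConfig{
		JobName:        "node",
		RelabelConfigs: []relabelConfig{{Regex: "(.*):\\d+", Replacement: "$1:9100"}},
	}
	generated, err := escapeReplacements(user)
	if err != nil {
		panic(err)
	}
	fmt.Println(user.RelabelConfigs[0].Replacement)      // prints $1:9100
	fmt.Println(generated.RelabelConfigs[0].Replacement) // prints $$1:9100
}
```

Without the deep copy, editing the slice elements in place would also rewrite the user's original config, since a plain struct assignment shares the underlying `RelabelConfigs` array.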

Signed-off-by: Ridwan Sharif <ridwanmsharif@google.com>

Signed-off-by: Ridwan Sharif <ridwanmsharif@google.com>
@ridwanmsharif ridwanmsharif force-pushed the ridwanmsharif/merge-prometheus branch from 2309a20 to 0f0c8ab Compare November 8, 2022 19:30
@ridwanmsharif ridwanmsharif added the kokoro:force-run Forces kokoro to run integration tests on a CL label Nov 8, 2022
@ridwanmsharif ridwanmsharif force-pushed the ridwanmsharif/merge-prometheus branch from fd39ed9 to 058eefe Compare November 8, 2022 21:00
@stackdriver-instrumentation-release stackdriver-instrumentation-release removed the kokoro:force-run Forces kokoro to run integration tests on a CL label Nov 8, 2022
@ridwanmsharif ridwanmsharif added the kokoro:force-run Forces kokoro to run integration tests on a CL label Nov 8, 2022
@stackdriver-instrumentation-release stackdriver-instrumentation-release removed the kokoro:force-run Forces kokoro to run integration tests on a CL label Nov 8, 2022
@ridwanmsharif
Contributor Author

Test failures look unrelated. Merging to master

@ridwanmsharif ridwanmsharif merged commit 56afd8c into master Nov 9, 2022
@ridwanmsharif ridwanmsharif deleted the ridwanmsharif/merge-prometheus branch November 9, 2022 02:01
avilevy18 added a commit that referenced this pull request Nov 18, 2022
Attempting to force flush feature tracking metrics

prometheus: add receiver for ingesting prometheus metrics using the Ops Agent (#904)

* prometheus: add config generation for the prometheus receiver (#844)

* prometheus: add config generation for the prometheus receiver

This change does the following:
- [  ] Pulls in the googlemanagedprometheus exporter for prometheus
- [  ] Pulls in prometheus so we can use the exact same config structure
- [  ] Adds the config generation for prometheus receivers
- [  ] Refactors some of the pipeline logic so prometheus receivers have
  their own exporter
- [  ] Adds config validation for prometheus receivers
- [  ] Adds basic unit tests for the prometheus receiver

* prometheus: add more unit tests

Signed-off-by: Ridwan Sharif <ridwanmsharif@google.com>

* prometheus: register all service discovery implementations so yaml parsing doesn't fail

Signed-off-by: Ridwan Sharif <ridwanmsharif@google.com>

* prometheus: report error in platform-agnostic way

Signed-off-by: Ridwan Sharif <ridwanmsharif@google.com>

Signed-off-by: Ridwan Sharif <ridwanmsharif@google.com>

* prometheus: add metadata labels to every static config (#872)

* prometheus: add metadata labels to every static config

This change adds the following:
- [] Hooks up the receiver to the metadata detector
- [] Adds labels to every static config
- [] Adds unit tests
- [] Adds integration tests
- [] Disallows updating namespace, location and cluster labels

* integration_test: ignore instance_id label for prom metrics

* prometheus: add groupbyattrs processor so namespace, location and cluster fields can be used

Signed-off-by: Ridwan Sharif <ridwanmsharif@google.com>

Signed-off-by: Ridwan Sharif <ridwanmsharif@google.com>

* prometheus: use prometheus styled regex instead of OTel (#886)

* prometheus: use prometheus styled regex instead of OTel

This is mainly focussed on the `replacement` field and us not using
the otel styled `$` syntax for the user visible prom config.

* prometheus: deep copy using marshal and unmarshal before updating regex

Signed-off-by: Ridwan Sharif <ridwanmsharif@google.com>

* prometheus: simplify deepcopy and escaping of $ in replacement strings

Signed-off-by: Ridwan Sharif <ridwanmsharif@google.com>

Signed-off-by: Ridwan Sharif <ridwanmsharif@google.com>

* Prometheus receiver: add integration test with JSON exporter (#869)

* prometheus: disable receiver by default

* prometheus: presubmit update license and yamlfmt

* prometheus: skip integration test on centos

* prometheus: address PR comments

* prometheus: update golden files

Signed-off-by: Ridwan Sharif <ridwanmsharif@google.com>
Co-authored-by: Lujie Duan <lujieduan@google.com>

Add caching to Windows build (#939)

* testing windows caching

* comment

* new fancy run

* try moving submodule update into dockerfile

* install git in dockerfile

* productionize flow

* comments resolved

* removed blank lines

* extra space

* remove cache missing logic from dockerfile

Add workaround for Windows 2012 fluent-bit lockups (#952)

* Add workaround for Windows 2012 fluent-bit lockups

See b/240564518 for more background.

* Revert "Testing: Add workaround for windows-2012 flakes"

This reverts commit a0c21e3.

* Revert force-restart workaround for Windows 2012

Testing: Add retries on `sudo sed` command when setting up SUSE test VM. (#956)

* Add retries on `sudo sed` command when setting up SUSE test VM.

* Add SLES-specific constants for max attempts and backoff duration.

Get bison package from team vendor repo (#954)

integration_test: skip prometheus tests on rhel (#959)

Add compatible restart command for sles-15-sap (#960)

Testing: change restart command to work on SLES-15 (#961)

Let's try just removing the `.target` option.

Use docker-credential-gcr instead of gcloud to match kokoro's prefetching logic (#964)

Fix missing opensuse condition (#974)

Fixes `TestPrometheusMetricsWithJSONExporter/opensuse-leap*`.

Add startup delay on SUSE platforms (#976)

Add ZYPP_LOCK_TIMEOUT to reduce flakes (#975)

Vault install and user documentation update (#973)

* add metric policy to script and replace init references

* add cleaner enable script that will guide users if they do not follow the configuration options

* add configure_integration documentation

* update doc nit

Internal: tests install `go` from a GCS bucket (#977)

This is to prevent flakes due to `golang.org` throttling us. :)

Strip out mentions of winrm.par (#925)

Use absolute path for mkswap and swapon (#978)

We're still not sure why `mkswap` is randomly failing on sles-15-sap,
but providing the absolute path does seem to help...

Remove sudo from scripts along with updated docker install (#955)

* Try skipping "update docker" step

* also print out docker version

* remove more obsolete steps

* focal masquerades as hirsute

* try again with jammy

* test removing sudo

* focal

Co-authored-by: Martijn van Schaardenburg <martijnvs@google.com>

Fall back to unqualified mkswap (#979)

Some platforms, e.g. bionic, have mkswap under a different folder.
We haven't had any problems running mkswap unqualified on bionic, or
anywhere outside of SLES really, so add a fallback to the unqualified
version of the command if the absolute version fails.

Testing: Run Oracle DB test in a more normal way (#893)

Update VERSION (#987)

Update minimum_supported_agent_version in metadata.yaml. (#988)

Co-authored-by: Rafael Westphal <westphalrafael@google.com>

Attempting to force flush feature tracking metrics

Try getting sudo access before running tests (#986)

resourcedetector: Get default service account scopes. (#984)

* Add getDefaultScopes() to resourcedetector.

* Add `getSlice()` to testing FakeProvider.

* Verify DefaultScopes in TestGettingResourceWithoutError.
avilevy18 added a commit that referenced this pull request Dec 6, 2022
* Adding support for feature tracking

* Added feature tracking into `CollectOpsAgentSelfMetrics()`

* Added feature tracking metric in `expected_metric` metadata.yaml

* Added confgenerator import to `main_windows`

* Fix bug

* Refactoring tests to include internal metrics

* Fixed dependencies

* Testing third party integrations - active_directory_ds

* Testing third party integrations - activemq

* Testing third party integrations - apache

* Added extra expected metric for active_directory_ds

* Fixed bug where feature extraction did not properly capture values of pointers

* Testing third party integrations - aerospike

* Merged, and fixed `go.mod`

* Fixed go.sum

* Addressed comments

4 participants