-
Notifications
You must be signed in to change notification settings - Fork 143
Implement Tailscale service #2679 #2712
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement Tailscale service #2679 #2712
Conversation
Add support for Tailscale as a Service that can configured from
Rockstor's webUI. This commit assumes that the tailscale package is
already installed on the system.
- add "Tailscale: tailscaled" as a new service in the `prep_db.py`
script with no default configuration (null).
- add new urls endpoints with the same commands as other services:
config, start, stop.
- unlike other services, add 2 new endpoints ("children" of "config")
for the login/logout actions.
- add a new TailscaledServiceView to handle these endpoints.
- add logic to support services needing login/authentication.
- add tailscale utils functions to interface with the tailscale cli
binary.
- add flag to `system.osi.run_command()` to return raw output. This
commit does NOT change the default. Add relevant unit testing.
- increase default number of services shown in the table to 30.
- change services table column width and center buttons.
- ignore only the static/ folder located in root dir
- add unit testing to `system.services`: only partial coverage, though.
| should return | ||
| "hostname": "rock-dev-123" | ||
| """ | ||
| if "hostname" in config: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
quick question, can a hostname not have unicode characters in it? If that's the case, wouldn't some "inclusive" regex be better than excluding what we currently know as "special" characters"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh that is one very good question here. It's true that this validation is rather ugly at the moment...
To answer your question on unicode characters: I've tested using Tailscale's UI, which allows you to rename a hostname. This is a good way to test what is valid or not. I've pasted the following into the box: München, and it autocorrected to a hyphen:
It thus seems we can't use anything besides alphanumeric and -. I'll change to using a regex filtering/correction as you recommended.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll change to using a regex filtering/correction as you recommended.
Given this is a first draft, and a large one at that, I'm keen to get this out as-is. So this could be an enhancement by way of a follow-up. It likely won't be the last with such a significant addition.
Building now from what we have currently with a view to getting this out to the testing channel. We are getting along in this testing phase but not at the end yet so still time for some additions/refinements before we start heading for the next stable.
|
@FroggyFlox - this is amazing!!! Looks simple enough :). I had a question on the hostname validation that I placed as a comment in the pull request. We will definitely need the documentation for that in the Rockstor docs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@FroggyFlox This is a superb effort/addition, I'm keen to get this out into testing for proper field tests as this is not a significant knowledge area for me.
@Hooverdan96's point was keen and we can follow that up as we progress in the current testing channel if need be.
Your additional test coverage here is most welcome. And thanks for the excellent tests on the new code presented in this PR.
Noting the following single test failure when Tailscale is not installed on the rpm build host:
======================================================================
FAIL: test_disable_ip_forwarding (rockstor.system.tests.test_system_network.SystemNetworkTests)
test proper call of os.remove() and final sysctl command
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/buildbot/worker/Poetry-Build-on-Leap15-5/rpmbuild/rockstor-core-5.0.4-2712/src/rockstor/system/tests/test_system_network.py", line 795, in test_disable_ip_forwarding
self.mock_os_remove.assert_called_once_with("/etc/sysctl.d/99-tailscale.conf")
File "/usr/lib64/python3.9/unittest/mock.py", line 918, in assert_called_once_with
raise AssertionError(msg)
AssertionError: Expected 'remove' to be called once. Called 0 times.
----------------------------------------------------------------------
Ran 279 tests in 26.910s
FAILED (failures=1)
We may want to assess if this is acceptable or even avoidable?
After installing the repo and package (Leap 15.5 only currently) via the instructions in our rockstor-installer issue comment: rockstor/rockstor-installer#142 (comment)
We have all tests pass as per PR text.
Thanks for thinking to extend our services page: it would have been a shame for our ongoing page refresh bug to have returned us to page one and lost this entry.
If one submits the default config (accessed via spanner or auto shown via enable switch):
There is a:
Unknown internal error doing a POST to /api/sm/services/tailscaled/config
/opt/rockstor/var/log/rockstor.log
system.exceptions.CommandException: Error running a command. cmd = /usr/bin/systemctl stop tailscaled. rc = 5. stdout = ['']. stderr = ['Failed to stop tailscaled.service: Unit tailscaled.service not loaded.', '']
Am I doing it wrong currently ?
I have installed the package via my indicated comment above. But that is all. Trying to simulate folks doing the same to see how we cope with that situation.
|
So the full rockstor.log entry for what is likely an incomplete install of Tailscale on my part, is: I'll review my referenced Tailscale install comment. |
|
OK, some progress: In my test setup I have the following: So a partial install where the service has not been registered but installed !! |
|
OK, likely my fault here, I had assumed the failed test, which passed later on, was due to the package actually having been installed: but alas I had only added the repo !!! Strange how the indicated test failed initially (two builds, but later passed). Something likely interesting there. So thus far: user error on my part. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, so bar my "doing it wrong" and discovering an anomaly re one of the tests; once tailscale is actually installed, visiting the config and submitting (all defaults) I get the following "Login" appear:
rleap15-5:~ # systemctl status tailscaled.service
● tailscaled.service - Tailscale node agent
Loaded: loaded (/lib/systemd/system/tailscaled.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2023-10-19 14:10:29 WEST; 6min ago
Docs: https://tailscale.com/kb/
Process: 16372 ExecStartPre=/usr/sbin/tailscaled --cleanup (code=exited, status=0/SUCCESS)
Main PID: 16415 (tailscaled)
Status: "Stopped; run 'tailscale up' to log in"
Tasks: 10 (limit: 2334)
CGroup: /system.slice/tailscaled.service
└─ 16415 /usr/sbin/tailscaled --state=/var/lib/tailscale/tailscaled.state --socket=/run/tailscale/tailscaled.sock --port=41641
Oct 19 14:10:30 rleap15-5 tailscaled[16415]: Backend: logs: be:3c910da634826d267eded683e93b9dbe247ea758eb2c0207468b5298fd7a0244 fe:
Oct 19 14:10:30 rleap15-5 tailscaled[16415]: Switching ipn state NoState -> NeedsLogin (WantRunning=false, nm=false)
Oct 19 14:10:30 rleap15-5 tailscaled[16415]: blockEngineUpdates(true)
Oct 19 14:10:30 rleap15-5 tailscaled[16415]: health("overall"): error: state=NeedsLogin, wantRunning=false
Oct 19 14:10:30 rleap15-5 tailscaled[16415]: wgengine: Reconfig: configuring userspace WireGuard config (with 0/0 peers)
Oct 19 14:10:30 rleap15-5 tailscaled[16415]: wgengine: Reconfig: configuring router
Oct 19 14:10:30 rleap15-5 tailscaled[16415]: wgengine: Reconfig: configuring DNS
Oct 19 14:10:30 rleap15-5 tailscaled[16415]: dns: Set: {DefaultResolvers:[] Routes:{} SearchDomains:[] Hosts:0}
Oct 19 14:10:30 rleap15-5 tailscaled[16415]: dns: Resolvercfg: {Routes:{} Hosts:0 LocalDomains:[]}
Oct 19 14:10:30 rleap15-5 tailscaled[16415]: dns: OScfg: {}
rleap15-5:~ # echo $?
0
After using the "Login" which opened a new tab to initialte a Tailscale sign-up/login with machine registration attached. Once logged in we get:
I was informed/provided-with a link in the same session for admin to accept this new machine. This was done successfully. The Tailscale 'Login' in Rockstor services page had then changed to the on/off switch in On position.
And sharing this machine via an invite link with @FroggyFlox provided the capability to ping from their tailnet. This is with the default configuration.
As my working knowledge of Tailscale is somewhat limited I'd like to get this out into our testing channel so folks more knowledgable there an test what we have to-date.
And issue can be created for @Hooverdan96 key point re inclusive regex improvement on hostname validation and likely some enhancements to cope more elegantly with no Tailscale installed situation.
|
In addition @FroggyFlox has now tested Rockstor Web-UI access to the build machine used in this end-to-end RPM build/install test. So that's all looking quite promising. I'll go ahead and merge this so we can get to the next testing release. Some follow-ups likely but we can issue/pr them in turn.
N.B. only the default configuration has been tested by myself. |
|
Post a reboot of the test system, and before Web-UI login, |
Fixes #2679
@phillxnet, @Hooverdan96: ready for review
This pull-request (PR) proposes to add a new Service (in System > Services) for Tailscale, allowing the users to connect their Rockstor machine to their existing Tailscale account entirely from Rockstor's UI, without command line interaction. See related issue for rationale.
Overall implementation
Introductory notes
This PR makes the assumption that the
tailscalepackage is already installed on the system, providing thetailscaleddaemon systemd service as well as thetailscalecli interface (on which this PR entirely relies). Ensuring the tailscale package is installed should thus be the subject of a dedicated issue in our rockstor-installer repository.To better understand the implementation in this PR, it is important to note that there are two main components to Tailscale on the system:
tailscaleddaemon (systemd service): creates thetundevice and "routes" traffic. This service being active is not sufficient for having an active Tailscale connection. This is done by thetailscalebinary (see below).tailscalecli binary: represents the interface between the user and its Tailscale account and connection configuration. This binary requires thetailscaleddaemon to up and running to do anything.@phillxnet, Tailscale licensing is detailed at: https://github.com/tailscale/tailscale/blob/main/LICENSE
Their website (https://tailscale.com/opensource/) details:
Rockstor's implementation
tailscaledservice is NOT running.tailscaleddaemon. This creates thetundevice but does not activate the Tailscale connection. This step is a requirement for the initiation of the login/authentication mechanism.tailscale upcall; this generates an authURL that we then retrieve and use to open a new browser window that the user can thus use to complete the process of authentication to their Tailscale account. When completed, the Tailscale service is ON and the machine is fully connected to their Tailscale network.tailscaleddaemon as this PR assumes the user might still want to turn it back ON (shortly) thereafter. I'm thinking of using Ability to reset services to original status/settings #2332 here: when resetting the service to its default, we consider that the user does not want Tailscale anymore and we would thus stop thetailscaleddaemon then.Implementation details
Tailscale service configuration
As detailed in #2679 (comment),
tailscalesupports many options, and this PR proposes to provide a simple form to set what might be the most commonly used ones.For all other options, a custom configuration box is offered so that the user can manually enter what they want. Note that if the user enters in this custom configuration box a parameter that is already supported in the form above it, this PR considers what was defined in the form as taking precedence over the custom configuration and thus ignores it.
The configuration form includes a variety of Javascript-based validation to prevent the user to enter incompatible or invalid settings. Note that this covers the incompatibilities that I have personally witnessed... there may be more that I have not seen yet.
Just like all other services, submitting the configuration form sends that data to the new
TailscaledViewthat parses it and validates it as follows:hostnameparam is present: verify it is alphanumeric and does not include special characters. Underscores are also invalid so we change them to hyphens (same as what the Tailscale webUI does).After this validation, we save the config in the database.
This also enables and starts the
tailscaledsystemd service, required for the next steps.See below for a screenshot of the configuration modal:

Tailscale service status
Unlike other Services, Tailscale needs to be authenticated to function. We thus consider the Tailscale service as ON if the machine is connected to the user's Tailscale network. This, however, requires authentication, explained in the next section.
For Tailscale, however, we need to have a more granular knowledge of Tailscale status than for other services (due to the reasons above). Fortunately,
tailscaleincludes a subcommand (status) that can report a very detailed report in JSON. This includes one important key:BackendState. This key indeed is enough to give us the information we need:We thus have 3 states. On top of that, we also have the state of "tailscaled systemd service not active" -> 4 states. In reality, however, we just need to know the following:
tailscaleddaemon running?tailscaleddaemon is running:tailscaleneed to be authenticated (BackendState: "NeedsLogin")? If yes: show "Login" button (see below).In
services.js, we currently use return codes as returned by systemd/supervisorctl to represent services status to display the ON/OFF toggle accordingly:Note that systemd already returns: 0 = unit is active, 3 = unit is not active, 4 = no such unit (@phillxnet's notes).
Here, we need a new state: needs login. I thus chose to attribute a return code of 5 to represent that. This is detailed in
system.services.tailscale_service_status(). As a result, if service status is 5, we need to display the "Login" button.Tailscale authentication/login
Running
tailscale upat this point would hang, waiting for the user to visit an automatically generated URL to login/authenticate to their Tailscale account. As a result, this would make the front-end hang as well. I have thus ended up relying on the following "feature": whentailscale upis triggered, it checks if authentication is needed and generates an authURL, waiting for the user to copy this URL, paste it in the browser and then complete the authentication process. When completed,tailscale updetects that and finishes activating the tailscale connection. Now, the "detail" that this PR uses is that when that initialtailscale upcall that generates the needed authURL finishes (successfully or not), the generated authURL can be retrieved fromtailscale status, and is still valid for the user to complete the authentication process. Rockstor thus uses that as follows:loginaction (new definition in urls) that callstailscale upwith a timeout of 2 sec. That is enough to generate the authURL, and then completes. We can then calltailscale statusto retrieve this authURL and save it in the config model entry. We then refresh the Backbone collection for the services to have that authURL available to our front-end and use that to open a new window with that URL. That process is contained in thetailscaleAuthActionfunction inservices.js.Machine expiration
By default, each machine newly connected to a Tailscale network expires in 30 days.
I've tried to mimick this as follows:
tailscale statusshows it as Stopped, but not needing login, unfortunately, so we have the ON/OFF toggle still in Rockstor's UI.tailscale up, which then detects it needs login. In Rockstor UI, this is detected and we show the "Login" button. One can then complete the login process.It is not as smooth as I would like, but I'm unsure how to deal with this "weird" state given the Needs login state is not detected by Tailscale before we need to know it. Maybe a "normal" machine expiration is reflected differently than manually removing it from the dashboard?
The
tailscale upcommand does have a--force-reauthflag that will force a re-authentication. Maybe this should be included in the configuration modal as a button in case of need.Testing
All development and testing was conducted on Leap 15.5; all tests are passing (note that we now have 17 more tests included in this PR):
Miscellaneous
Services table
I had to make a few re-arrangements in the table listing all services:
text-align: leftcss that was forcing a left alignment for the buttons.The resulting table does have a lot of empty space in the center, though... This is another design choice to be discussed as/if desired.
Change to
run_command()**This PR does include a change to
system.osi.run_command()**. This change was required to easily load the JSON output fromtailscale status --jsonas a dict. I've slightly editedrun_command()to allow returning the "raw" output fromp.communicate()`, while leaving the default behavior the same as before.Help button link
The link included in the help button at the top of the configuration modal points to the docs' root. This should be changed to point to a dedicated description of the Tailscale service once it exists.
Change to .gitignore
We previously had
static/in.gitignoreto ignore the static dir created bycollectstatic. This, however, was not selective enough and was causingstatic/subfolders to be ignored. That includedconfigure_tailscaled.jst, for instance. The change in this PR should concern only thestatic/folder located in the project's root:/static/.How the new tailscale0 tun device shows in the Network page
tailscaledsystemd service is inactive:tailscaledsystemd service is active, but tailscale is OFF:Tailscale service ON:
