Skip to content

Unify CLI rule specification options with rule file options#5

Closed
aszlig wants to merge 2 commits into
masterfrom
rulefile-unification
Closed

Unify CLI rule specification options with rule file options#5
aszlig wants to merge 2 commits into
masterfrom
rulefile-unification

Conversation

@aszlig

@aszlig aszlig commented Sep 15, 2019

Copy link
Copy Markdown
Contributor

Re-issue of #4 in the hopes it won't get merged this time before the questions here are resolved.

Since the rule file was implemented first, the option names are still unchanged to date, but when introducing the -r command line option, I decided to go for short names instead.

This however is a bit confusing if you're used to -r and suddenly have to use different option names for the rule files.

Some options however are a bit difficult to unify, because they have different semantics:

  • The portEnd rule file option doesn't exist as an extra option in the CLI specification, but needs to be specified in a range notation, like eg. port=200-300.
  • Likewise, the fdName option is simply systemd=some_fd_name, but it might be a good idea to nest the option, resulting in systemd.enable and systemd.fdName.
  • The reject option also has three states in the CLI specification: disabled, enabled and enabled with errno.

I'm not quite sure what would be the best way to handle this yet, because if we'd use nested attributes, the port option would still be different because it's not simply a boolean with an optional string/integer value. When translating the port CLI option into a nested object, it would look like this for two different rules:

- path: foo
  port:
    start: 30
    end: 80
- path: bar
  port:
    exact: 443

This clearly makes the port option much more confusing than simply having a port option along with an optional portEnd option.

Another way to approach this is to allow different types for those options, for example the systemd option would then either accept a boolean or a string. I'd consider this the ugliest option and it also won't work so well with the port option, except if we'd parse it in the same way as in the CLI specification and accept either an integer or a string.

Yet another way would be to keep the port/portEnd and reject/rejectErrno as is but just change the fdName option to something like systemdFdName.


Given the following CLI invocation:

$ ip2unix -r port=80-443,systemd=foo \
          -r port=53,reject \
          -r port=6667,reject=EPERM \
          -r systemd \
          ...

... here are how the ideas above would look like in rule file format:

- port:
    start: 80
    end: 443
  systemd:
    enable: true
    fdName: foo
- port:
    exact: 53
  reject:
    enable: true
- port:
    exact: 6667
  reject:
    enable: true
    errno: EPERM
- systemd:
    enable: true
- port: 80-443
  systemd: foo
- port: 53
  reject: true
- port: 6667
  reject: EPERM
- systemd: true
- port: 80
  portEnd: 443
  systemd: true
  systemdFdName: foo
- port: 53
  reject: true
- port: 6667
  reject: true
  rejectErrno: EPERM
- systemd: true

Cc: @Profpatsch
Cc: @kyren for providing helpful input for designing the CLI specification format back then

The rule file options were defined earlier than the rule specification
options (via the -r command line flag) and when designing the rule
specification format, I tried to keep options as short as possible.

Unfortunately, this introduced inconsistencies between the naming of
options in both rule formats and the major offender is "socketPath" in
the rule file, where it is just "path" via the command line argument.

So when renaming, I didn't remove the old option but rather print a
deprecation warning instead to make sure we don't immediately break
things for users.

Signed-off-by: aszlig <aszlig@nix.build>
In the previous commit, we renamed the socketPath option to reduce
differences between rule specifications via command line and options in
the rule file.

Another option is "socketActivation", which is just called "systemd" in
the CLI specification.

The name/semantics of this option however might still be subject to
change, depending on how we are going to rename the fdName option,
because setting eg. fdName to "foo" is equivalent to "systemd=foo" in
the CLI specification.

One example where we'd need to touch the "systemd" option again is if
we'll end up with nested options like this:

  systemd:
    enable: true
    fdName: foo

Signed-off-by: aszlig <aszlig@nix.build>
@aszlig aszlig self-assigned this Sep 15, 2019
@aszlig

aszlig commented Sep 15, 2019

Copy link
Copy Markdown
Contributor Author

@qknight: I hope the title is now more clear ;-)

@Profpatsch

Copy link
Copy Markdown

Github has a (rather new) feature where you can open a PR as a Draft

@aszlig aszlig marked this pull request as draft May 28, 2020 02:33
@aszlig aszlig changed the title Unify CLI rule specification options with rule file options [DO NOT MERGE] Unify CLI rule specification options with rule file options May 28, 2020
@aszlig aszlig added this to the 3.0.0 milestone May 28, 2020
aszlig added a commit that referenced this pull request May 29, 2020
When compiling with address sanitiser enabled, we get the following
error during tests:

  ERROR: AddressSanitizer: stack-buffer-overflow on address
                           0x7fffffff61a0 at pc 0x00000041dd5d bp
                           0x7fffffff43f0 sp 0x7fffffff43e8
  READ of size 1 at 0x7fffffff61a0 thread T0
    #0 0x41dd5c in GlobPath::match_cclass(unsigned long*, char const&) ../src/globpath.cc:73
    #1 0x41f0d4 in GlobPath::match() ../src/globpath.cc:237
    #2 0x41f562 in globpath(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&,
                            std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ../src/globpath.cc:298
    #3 0x40e20c in main ../tests/unit/globpath.cc:24
    #4 0x7ffff720ad8a in __libc_start_main (/nix/store/bwzra330vib0ik4d3l8rq6gp6y2ah1fr-glibc-2.30/lib/libc.so.6+0x23d8a)
    #5 0x41ce79 in _start (/build/source/build/tests/unit/test_globpath+0x41ce79)

The reported issue (#15) has a similar output via Valgrind, but both
boil down to the same problem:

If the end square bracket ("]") is missing for a character class
pattern, the loop doesn't check whether we're overflowing the
std::string_view.

I've fixed this by adding a check which immediately discards the
character class as invalid whenever we hit the end of the pattern.

Signed-off-by: aszlig <aszlig@nix.build>
Reported-by: @etam
Fixes: #15
@aszlig aszlig closed this in e597e5c Jul 7, 2020
@aszlig aszlig deleted the rulefile-unification branch July 8, 2020 00:27
aszlig added a commit that referenced this pull request May 20, 2026
This fixes a long-standing issue on aarch64, which is caused by mimalloc
in Python 3 calling close() too early:

  #0  0x0000007ff75952ac in __pthread_kill_implementation () from .../lib/libc.so.6
  #1  0x0000007ff753c5b8 in raise () from .../lib/libc.so.6
  #2  0x0000007ff7525d38 in abort () from .../lib/libc.so.6
  #3  0x0000007ff7299818 in __gnu_cxx::__verbose_terminate_handler() () from .../lib/libstdc++.so.6
  #4  0x0000007ff729704c in __cxxabiv1::__terminate(void (*)()) () from .../lib/libstdc++.so.6
  #5  0x0000007ff728e49c in std::terminate() () from .../lib/libstdc++.so.6
  #6  0x0000007ff72973dc in __cxa_throw () from .../lib/libstdc++.so.6
  #7  0x0000007ff728ed7c in std::__throw_bad_array_new_length() () from .../lib/libstdc++.so.6
  #8  0x0000007ff7f10c54 in std::__new_allocator<std::__detail::_Hash_node_base*>::allocate (this=0x7fffffe890, __n=18446744073709551557) at .../include/c++/15.2.0/bits/new_allocator.h:139
  #9  0x0000007ff7f6e898 in std::allocator_traits<std::allocator<std::__detail::_Hash_node_base*> >::allocate (__a=..., __n=18446744073709551557)
      at .../include/c++/15.2.0/bits/alloc_traits.h:614
  #10 std::__detail::_Hashtable_alloc<std::allocator<std::__detail::_Hash_node<int, false> > >::_M_allocate_buckets (this=0x7ff7fb02a0 <all_fds>, __bkt_count=18446744073709551557)
      at .../include/c++/15.2.0/bits/hashtable_policy.h:1605
  #11 0x0000007ff7f6e500 in std::_Hashtable<int, int, std::allocator<int>, std::__detail::_Identity, std::equal_to<int>, std::hash<int>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, true, true> >::_M_allocate_buckets (this=0x7ff7fb02a0 <all_fds>, __bkt_count=18446744073709551557)
      at .../include/c++/15.2.0/bits/hashtable.h:413
  #12 0x0000007ff7f6d944 in std::_Hashtable<int, int, std::allocator<int>, std::__detail::_Identity, std::equal_to<int>, std::hash<int>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, true, true> >::_M_rehash (this=0x7ff7fb02a0 <all_fds>, __bkt_count=18446744073709551557) at .../include/c++/15.2.0/bits/hashtable.h:2754
  #13 0x0000007ff7f6c0b8 in std::_Hashtable<int, int, std::allocator<int>, std::__detail::_Identity, std::equal_to<int>, std::hash<int>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, true, true> >::_M_insert_unique_node (this=0x7ff7fb02a0 <all_fds>, __bkt=3, __code=3, __node=0x5555583580, __n_elt=1)
      at .../include/c++/15.2.0/bits/hashtable.h:2479
  #14 0x0000007ff7f6a034 in std::_Hashtable<int, int, std::allocator<int>, std::__detail::_Identity, std::equal_to<int>, std::hash<int>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, true, true> >::_M_emplace_uniq<int const&> (this=0x7ff7fb02a0 <all_fds>) at .../include/c++/15.2.0/bits/hashtable.h:2365
  #15 0x0000007ff7f685a8 in std::_Hashtable<int, int, std::allocator<int>, std::__detail::_Identity, std::equal_to<int>, std::hash<int>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, true, true> >::insert (this=0x7ff7fb02a0 <all_fds>, __v=@0x7fffffebb8: 3) at .../include/c++/15.2.0/bits/hashtable.h:1049
  #16 0x0000007ff7f66b60 in std::unordered_set<int, std::hash<int>, std::equal_to<int>, std::allocator<int> >::insert (this=0x7ff7fb02a0 <all_fds>, __x=@0x7fffffebb8: 3)
      at .../include/c++/15.2.0/bits/unordered_set.h:475
  #17 0x0000007ff7f64c80 in Systemd::init (rules=...) at ../src/systemd.cc:213
  #18 0x0000007ff7f14bd0 in init_rules () at ../src/preload.cc:77
  #19 0x0000007ff7f19edc in ip2unix_wrap_close (fd=4) at ../src/preload.cc:677
  #20 0x0000007ff78cf430 in mi_process_init.part () from .../lib/libpython3.13.so.1.0
  #21 0x0000007ff7894ee8 in _mi_process_init () from .../lib/libpython3.13.so.1.0
  #22 0x0000007ff7fc32c4 in call_init () from .../lib/ld-linux-aarch64.so.1
  #23 0x0000007ff7fc341c in _dl_init () from .../lib/ld-linux-aarch64.so.1
  #24 0x0000007ff7fdbea0 in _start () from .../lib/ld-linux-aarch64.so.1

Here the non-local all_fds variable is an unordered_set which hasn't yet
been initialised but is accessed because _mi_process_init has a lower
priority and thus is initialised *before* all_fds.

Since mi_process_init calls close(), an insert on the all_fds variable
is triggered by the systemd rule initialisation, which then fails
because all_fds hasn't been initialised yet.

The all_fds variable isn't the only one that's affected, so in order to
avoid wrapping all non-local variables in lazy initalisation wrappers,
we now short-circuit all the library calls that trigger the ruleset
initialisation machinery.

While at it, I also fixed a potential data race that could emerge in
multithreaded programs. While in practice most networking code is
initialised before actually spawning threads, it's still something that
could happen, so let's use std::call_once on rule initialisation.

Signed-off-by: aszlig <aszlig@nix.build>
aszlig added a commit that referenced this pull request May 20, 2026
This fixes a long-standing issue on aarch64, which is caused by mimalloc
in Python 3 calling close() too early:

  #0  0x0000007ff75952ac in __pthread_kill_implementation () from .../lib/libc.so.6
  #1  0x0000007ff753c5b8 in raise () from .../lib/libc.so.6
  #2  0x0000007ff7525d38 in abort () from .../lib/libc.so.6
  #3  0x0000007ff7299818 in __gnu_cxx::__verbose_terminate_handler() () from .../lib/libstdc++.so.6
  #4  0x0000007ff729704c in __cxxabiv1::__terminate(void (*)()) () from .../lib/libstdc++.so.6
  #5  0x0000007ff728e49c in std::terminate() () from .../lib/libstdc++.so.6
  #6  0x0000007ff72973dc in __cxa_throw () from .../lib/libstdc++.so.6
  #7  0x0000007ff728ed7c in std::__throw_bad_array_new_length() () from .../lib/libstdc++.so.6
  #8  0x0000007ff7f10c54 in std::__new_allocator<std::__detail::_Hash_node_base*>::allocate (this=0x7fffffe890, __n=18446744073709551557) at .../include/c++/15.2.0/bits/new_allocator.h:139
  #9  0x0000007ff7f6e898 in std::allocator_traits<std::allocator<std::__detail::_Hash_node_base*> >::allocate (__a=..., __n=18446744073709551557)
      at .../include/c++/15.2.0/bits/alloc_traits.h:614
  #10 std::__detail::_Hashtable_alloc<std::allocator<std::__detail::_Hash_node<int, false> > >::_M_allocate_buckets (this=0x7ff7fb02a0 <all_fds>, __bkt_count=18446744073709551557)
      at .../include/c++/15.2.0/bits/hashtable_policy.h:1605
  #11 0x0000007ff7f6e500 in std::_Hashtable<int, int, std::allocator<int>, std::__detail::_Identity, std::equal_to<int>, std::hash<int>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, true, true> >::_M_allocate_buckets (this=0x7ff7fb02a0 <all_fds>, __bkt_count=18446744073709551557)
      at .../include/c++/15.2.0/bits/hashtable.h:413
  #12 0x0000007ff7f6d944 in std::_Hashtable<int, int, std::allocator<int>, std::__detail::_Identity, std::equal_to<int>, std::hash<int>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, true, true> >::_M_rehash (this=0x7ff7fb02a0 <all_fds>, __bkt_count=18446744073709551557) at .../include/c++/15.2.0/bits/hashtable.h:2754
  #13 0x0000007ff7f6c0b8 in std::_Hashtable<int, int, std::allocator<int>, std::__detail::_Identity, std::equal_to<int>, std::hash<int>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, true, true> >::_M_insert_unique_node (this=0x7ff7fb02a0 <all_fds>, __bkt=3, __code=3, __node=0x5555583580, __n_elt=1)
      at .../include/c++/15.2.0/bits/hashtable.h:2479
  #14 0x0000007ff7f6a034 in std::_Hashtable<int, int, std::allocator<int>, std::__detail::_Identity, std::equal_to<int>, std::hash<int>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, true, true> >::_M_emplace_uniq<int const&> (this=0x7ff7fb02a0 <all_fds>) at .../include/c++/15.2.0/bits/hashtable.h:2365
  #15 0x0000007ff7f685a8 in std::_Hashtable<int, int, std::allocator<int>, std::__detail::_Identity, std::equal_to<int>, std::hash<int>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, true, true> >::insert (this=0x7ff7fb02a0 <all_fds>, __v=@0x7fffffebb8: 3) at .../include/c++/15.2.0/bits/hashtable.h:1049
  #16 0x0000007ff7f66b60 in std::unordered_set<int, std::hash<int>, std::equal_to<int>, std::allocator<int> >::insert (this=0x7ff7fb02a0 <all_fds>, __x=@0x7fffffebb8: 3)
      at .../include/c++/15.2.0/bits/unordered_set.h:475
  #17 0x0000007ff7f64c80 in Systemd::init (rules=...) at ../src/systemd.cc:213
  #18 0x0000007ff7f14bd0 in init_rules () at ../src/preload.cc:77
  #19 0x0000007ff7f19edc in ip2unix_wrap_close (fd=4) at ../src/preload.cc:677
  #20 0x0000007ff78cf430 in mi_process_init.part () from .../lib/libpython3.13.so.1.0
  #21 0x0000007ff7894ee8 in _mi_process_init () from .../lib/libpython3.13.so.1.0
  #22 0x0000007ff7fc32c4 in call_init () from .../lib/ld-linux-aarch64.so.1
  #23 0x0000007ff7fc341c in _dl_init () from .../lib/ld-linux-aarch64.so.1
  #24 0x0000007ff7fdbea0 in _start () from .../lib/ld-linux-aarch64.so.1

Here the non-local all_fds variable is an unordered_set which hasn't yet
been initialised but is accessed because _mi_process_init has a lower
priority and thus is initialised *before* all_fds.

Since mi_process_init calls close(), an insert on the all_fds variable
is triggered by the systemd rule initialisation, which then fails
because all_fds hasn't been initialised yet.

The all_fds variable isn't the only one that's affected, so in order to
avoid wrapping all non-local variables in lazy initalisation wrappers,
we now short-circuit all the library calls that trigger the ruleset
initialisation machinery.

While at it, I also fixed a potential data race that could emerge in
multithreaded programs. While in practice most networking code is
initialised before actually spawning threads, it's still something that
could happen, so let's use std::call_once on rule initialisation.

Signed-off-by: aszlig <aszlig@nix.build>
Tested-by: Adam Mizerski <adam@mizerski.pl>
Closes: #40
Fixes: #37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants