Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexpected conversion for _ in filename() #60

Closed
leiyu-bytedance opened this issue Apr 11, 2021 · 18 comments
Closed

Unexpected conversion for _ in filename() #60

leiyu-bytedance opened this issue Apr 11, 2021 · 18 comments
Assignees

Comments

@leiyu-bytedance
Copy link
Contributor

The filename() is expected to return the DBus object path's "filename", however, it converts underscore _ unexpectedly.

E.g. the object "/xyz/openbmc_project/software/bios_active"'s filename() shall be bios_active, but sdbusplus::message::object_path::filename() converts it to bios\xACtive.

This issue is found in g220a's HW CI after rebasing the code to upstream 06dc92d6f.

@leiyu-bytedance
Copy link
Contributor Author

sd_bus_path_decode() is decoding _ac to a special character, is it a bug in sd_bus_path_decode()?

@wak-google
Copy link
Contributor

Yeah, we can't apply sd_bus_path_encode() / decode() to all paths as it changes the meaning of all _

@leiyu-bytedance
Copy link
Contributor Author

@edtanous

@edtanous
Copy link
Contributor

edtanous commented Apr 11, 2021

Yep, this is a problem; We should probably discuss the solution on discord, because I'm on the fence with what to do about it. The issue is that underscore has special meaning in dbus paths when they're encoded as the precursor to an escape sequence. Unfortunately, we also seem to use underscores in the example above where they're not escaped. In theory bios_active should've been escaped as bios_5Factive, which would've caused this to decode properly. Instead, it's being put on dbus as-is, which is decoding to the hex code for 0xAC, which is extended ascii for a code that we don't support.

A quick look through the services on my desktop shows that most services sidestep this by creating camel cased path identifiers for well known dbus paths, although things like org.freedesktop.systemd1, which need to handle freeform strings definitely do escape underscores.

The options I see going forward are:

  1. Make sure all paths that want to use underscore are escaped properly using the operator/ apis. This might be an isolated, or relatively isolated case, in which case this just needs a patch to the service to fix this.
  2. For any well known paths, like bios_active, move to the camel cased version, like most other systemd services do.
  3. Remove all reliance on well known dbus paths, in which case, the failed escaping shouldn't matter.
  4. Revert the recent patches, and simply disallow anything but ascii characters on dbus in OpenBMC (which has impacts on what names you can use in sensors for example).
  5. Something else?

Option 2 would be my preference.

PS, can you point out the code that's actually broken here? Ie, what API is using filename() and expecting to get bios_active back out? I didn't know there was widespread usage of filename() yet, and the only usage of bios_active that I'm able to find is in intel-ipmi-oem, but I'm unable to find the backend implementation for it. Can you give me some pointers to what code you're looking at?

https://github.com/openbmc/intel-ipmi-oem/blob/9420416abe06d943301ab304044ed8001d49e315/include/oemcommands.hpp#L165

@leiyu-bytedance
Copy link
Contributor Author

PS, can you point out the code that's actually broken here? Ie, what API is using filename() and expecting to get bios_active back out? I didn't know there was widespread usage of filename() yet, and the only usage of bios_active that I'm able to find is in intel-ipmi-oem, but I'm unable to find the backend implementation for it. Can you give me some pointers to what code you're looking at?

It is the bmcweb openbmc/bmcweb@2dfd18ef that changes the logic of getting the software from manually parsing the string to invoking filename().
The filename() was returning bios_active before the sdbusplus change's fa0fa3b, and it breaks the string after fa0fa3b

@edtanous
Copy link
Contributor

edtanous commented Apr 12, 2021

PS, can you point out the code that's actually broken here? Ie, what API is using filename() and expecting to get bios_active back out? I didn't know there was widespread usage of filename() yet, and the only usage of bios_active that I'm able to find is in intel-ipmi-oem, but I'm unable to find the backend implementation for it. Can you give me some pointers to what code you're looking at?

It is the bmcweb openbmc/bmcweb@2dfd18e that changes the logic of getting the software from manually parsing the string to invoking filename().
The filename() was returning bios_active before the sdbusplus change's fa0fa3b, and it breaks the string after fa0fa3b

What is the backend code that is producing an object called "bios_active" for your system? That will need to be corrected to either publish BiosActive or bios_5Factive.

Also, which bmcweb endpoint are you testing with? The /xyz apis? I don't see any specific call outs for bios_active in the redfish endpoints.

@leiyu-bytedance
Copy link
Contributor Author

The bios_active comes from openbmc/phosphor-bmc-code-mgmt@6e9fb1d, but its origin is from Intel-BMC: https://github.com/search?q=org%3AIntel-BMC+bios_active&type=code

@williamspatrick
Copy link
Member

Unfortunately, we've used snake_case almost everywhere as our preferred dbus naming scheme. Even our TLDs are snake_case (org/open_power, xyz/openbmc_project). I'm sort of leaning towards we should rethink how we do escaping and we shouldn't just take in the systemd method.

Looking at my own system, org.freedesktop.systemd1 and org.freedesktop.network1 use the systemd manner of escaping but other services do not (org.freedesktop.login1, org.freedesktop.UDisks2, org.freedesktop.ColorManager). The really odd thing is network1 actually uses a mixture and login1 is even a systemd-provided service.

$ busctl tree org.freedesktop.login1   
└─/org
  └─/org/freedesktop
    ├─/org/freedesktop/LogControl1
    └─/org/freedesktop/login1
      ├─/org/freedesktop/login1/seat
      │ ├─/org/freedesktop/login1/seat/auto
      │ └─/org/freedesktop/login1/seat/seat0
      ├─/org/freedesktop/login1/session
      │ ├─/org/freedesktop/login1/session/_32
      │ ├─/org/freedesktop/login1/session/_34
      │ └─/org/freedesktop/login1/session/auto
      └─/org/freedesktop/login1/user
        ├─/org/freedesktop/login1/user/_1000
        └─/org/freedesktop/login1/user/self
$ busctl tree org.freedesktop.network1
└─/org
  └─/org/freedesktop
    ├─/org/freedesktop/LogControl1
    └─/org/freedesktop/network1
      ├─/org/freedesktop/network1/link
      │ ├─/org/freedesktop/network1/link/_31
      │ ├─/org/freedesktop/network1/link/_32
      │ └─/org/freedesktop/network1/link/_33
      └─/org/freedesktop/network1/network
        ├─/org/freedesktop/network1/network/_330_2diphone
        ├─/org/freedesktop/network1/network/_350_2ddhcp
        ├─/org/freedesktop/network1/network/_380_2dcontainer_2dve
        ├─/org/freedesktop/network1/network/_380_2dcontainer_2dvz
        ├─/org/freedesktop/network1/network/_380_2dvm_2dvt
        └─/org/freedesktop/network1/network/_380_2dwifi_2dadhoc

Notice that in network the link hierarchy doesn't use the escape but the network does.

@williamspatrick
Copy link
Member

Alright, correction. network1 does use the escape. ascii 31 = '1' and that is the name of the link. login1 uses it for session maybe, but not user? My uid is 1000, so _1000 is not escaped. But the session _32 has an Id field of "2", so that appears that it does use the escaping.

(At first I couldn't figure out why they decided to escape integers like that, since integers are valid and there are some odd ones like _330_2diphone which unescapes to 30-iphone. I guess they want to make C-like variable strings with their escaping and you can't start a C variable with an integer?)

@edtanous
Copy link
Contributor

I'm sort of leaning towards we should rethink how we do escaping and we shouldn't just take in the systemd method.

Could you elaborate a little bit on what you're thinking? Given the number of allowed characters in dbus paths, there aren't that many options that are even possible.

At first I couldn't figure out why they decided to escape integers like that

If I remember right, you can't start a dbus path component with an integer (or something like that, the details are fuzzy).

@williamspatrick
Copy link
Member

If I remember right, you can't start a dbus path component with an integer (or something like that, the details are fuzzy).

FWIW, object paths segments can start with integers. Bus Names and Members have this prohibition but not objects.

https://dbus.freedesktop.org/doc/dbus-specification.html#message-protocol-marshaling-object-path

@jonathan-doman
Copy link
Contributor

For another data point, this issue is breaking a lot of stuff in intel platforms due to the chassis board name containing an underscore which is not escaped by EntityManager. The board name is decoded by bmcweb (into a broken looking name) when generating the ChassisCollection links, but not encoded again when following that link to get the Chassis resource, so the Redfish Chassis is inaccessible.

I don't really follow why we should want to encode all path names. Underscores are perfectly OK in D-Bus paths, and we happen to be using it in a lot of places (since /xyz/openbmc_project/ suggested that we were supposed to use snake_case), for various functional endpoints and also in sensor names, etc. I would prefer to have the encoding be optional/contextual - although I realize it's harder to manage piecemeal conventions like that.

@edtanous
Copy link
Contributor

For another data point, this issue is breaking a lot of stuff in intel platforms due to the chassis board name containing an underscore which is not escaped by EntityManager. The board name is decoded by bmcweb (into a broken looking name) when generating the ChassisCollection links, but not encoded again when following that link to get the Chassis resource, so the Redfish Chassis is inaccessible.

This sounds like a bug. I can look into it. bmcweb should be encoding and decoding at every step.

I don't really follow why we should want to encode all path names.

Because there are things we want to represent on dbus that aren't just a-z0-9_. In this specific case, there's a desire to have dashes in sensor names.

Underscores are perfectly OK in D-Bus paths, and we happen to be using it in a lot of places (since /xyz/openbmc_project/ suggested that we were supposed to use snake_case), for various functional endpoints and also in sensor names, etc. I would prefer to have the encoding be optional/contextual - although I realize it's harder to manage piecemeal conventions like that.

There's a patchset working on exactly what you want. The initial half-done stab at this is here:
https://gerrit.openbmc-project.xyz/c/openbmc/sdbusplus/+/42512
We also talked about reverting the last encoding patch, which I have yet to push the review for, but I will.

@Howitzer105mm
Copy link

Ed,
I'm late to the discussion.
In include/utils/collection.hpp:
sdbusplus::message::object_path path(object);
std::string leaf = path.filename();

The invocation of the sdbusplus::message::object_path is the point at which I found the D-Bus Chassis path getting mangled. The object has AC_Baseboard on the way in, and "path" ends up with AC\ufffdseboard coming out.
I submit this since you were only finding the filename() in the ipmi code.

@edtanous
Copy link
Contributor

Yep, you're right, filename() is "mangling" the path by applying escaping rules. James invented his own escaping rules in entity-manager when he found this issue initially (all non-encodable characters become underscores) which doesn't really cover the full range, and causes entity-manager to crash in weird ways if you mess up the json even a little. That's largely what I'm trying to fix with the encoding and decoding.

https://gerrit.openbmc-project.xyz/c/openbmc/bmcweb/+/42628 should makes the Redfish directory structure traversal again, although the names will still be weird, but considering nobody should be using IDs directly, that's less of a concern (something I'm still wanting to fix with the sdbusplus patch though. I need to figure out how to test it (none of my systems put underscores in chassis names).

@Howitzer105mm
Copy link

I'd like to understand a bit better...
The github readme indicates sdbusplus is for interfacing to D-Bus using systemd code as the basis.
Is the current situation a result of using the systemd code as the base that is causing the D-Bus path/intf/object parsing issue?
Am I understanding the long term plan is to require changes to existing D-Bus elements to fit systemd rules?
You mention in 42512 that you expect it to be a transitory correction, so I'd like clarification of the long term correction.

@Howitzer105mm
Copy link

In my first submission to this thread I was pointing out that the name was mangled by sdbusplus::message::object_path path(object); before filename() was even called. It was the ::object_path constructor doing the damage to filename() in my investigation of the issue.

@edtanous
Copy link
Contributor

I would like to see us move to to systemd naming conventions, which rely on the underscore character and hex codes to escape paths. I cannot imagine this is everyone's direction, and I'm sure there will be some differing of opinions, but it's something we should definitely discuss. We've already had issues with daemons that have to interact with systemd services having to do some kind of manual escaping by hand (also getting it wrong). Also, things like entity-manager invented their own escaping mechanisms (mostly because we didn't know any sort of standard existed at the time). This is causing issues where certain configs can crash entity-manager because of a bad name in a config file, which is unfortunate because EM is supposed to have a user facing dbus interface.

bradbishop pushed a commit to openbmc/bmcweb that referenced this issue May 4, 2021
A common anti pattern is using endswith or contains.  This got
compounded recently because of openbmc/sdbusplus#60
where filename now contains decoding logic.

This at least should fix the issue to make the Redfish tree walkable
again.

Tested:
Loaded up a WFP_Baseboard on qemu, and verified that
/redfish/v1/Chassis/WFP_Baseboard traversed correctly.

Signed-off-by: Ed Tanous <edtanous@google.com>
Change-Id: I55cc389a773e4b594668a4bca466efc33cc9cb7c
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants