OpRequest Flow in RADOS
The Messenger handles connections and generic messages. A message will be dispatched to any registered dispatchers via the ms_dispatch virtual method on the Dispatcher interface. The OSD class implements the Dispatcher interface.
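The relationship between the messenger and its dispatchers can be sketched roughly as follows. This is a toy model, not Ceph code: ToyMessenger, ToyOSD, the bare Message struct, the message type 42, and the "stop at the first dispatcher that returns true" delivery loop are all assumptions made for illustration. Only the ms_dispatch method name and its bool return type come from the trace below.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a message; it only carries a type tag here.
struct Message {
    int type;
};

// Simplified dispatcher interface: return true if the message was handled.
struct Dispatcher {
    virtual ~Dispatcher() = default;
    virtual bool ms_dispatch(Message *m) = 0;
};

// Toy OSD that accepts only messages of one made-up type (42).
struct ToyOSD : Dispatcher {
    int handled = 0;
    bool ms_dispatch(Message *m) override {
        if (m->type != 42) return false;  // not ours, let another dispatcher try
        ++handled;
        return true;
    }
};

// Toy messenger: offers each message to dispatchers in registration order
// and stops at the first one that claims it (an assumption of this sketch).
struct ToyMessenger {
    std::vector<Dispatcher *> dispatchers;
    void add_dispatcher(Dispatcher *d) { dispatchers.push_back(d); }
    bool deliver(Message *m) {
        for (Dispatcher *d : dispatchers)
            if (d->ms_dispatch(m)) return true;
        return false;
    }
};
```

Registering a ToyOSD with add_dispatcher and calling deliver on a message is the toy equivalent of the first step of the trace described in this post.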
Update 06/08/14: this post is out of date with the current source tree. See OSD Request Processing Latency
There are two high-level asynchronous traces described below. The first is the process of receiving, preparing, and queueing a request. The second is from the perspective of separate worker threads that dequeue requests to be processed.
Message Dispatch and Req Enqueue
The trace begins when a message is dispatched to the OSD. At a high level, the entry point is:
bool OSD::ms_dispatch(Message *m)
- src/osd/OSD.cc:4720
There are two paths that can be taken, both of which will arrive at OSD::dispatch_op.
void OSD::_dispatch(Message *m)
- Construct a new OpRequest
- src/osd/OSD.cc:4937
void OSD::do_waiters()
- Grab an existing OpRequest
- src/osd/OSD.cc:4840
Both _dispatch and do_waiters then pass the request down the following call chain.
void OSD::dispatch_op(OpRequestRef op)
- src/osd/OSD.cc:4857
void OSD::handle_op(OpRequestRef op)
- src/osd/OSD.cc:7352
void OSD::enqueue_op(PG *pg, OpRequestRef op)
- src/osd/OSD.cc:7546
void PG::queue_op(OpRequestRef op)
- src/osd/PG.cc:1707
The request is now living on a queue waiting to be picked up by a worker.
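As a rough illustration of how the two entry paths converge, here is a minimal sketch. The ToyOSD type, the string payload, and the single op_queue (standing in for the per-PG queue) are assumptions of this example; the real code goes through handle_op, enqueue_op, and PG::queue_op as listed above, and does much more along the way.

```cpp
#include <cassert>
#include <deque>
#include <memory>
#include <string>
#include <utility>

// Hypothetical request wrapper; the real OpRequest tracks far more state.
struct OpRequest {
    std::string payload;
};
using OpRequestRef = std::shared_ptr<OpRequest>;

struct ToyOSD {
    std::deque<OpRequestRef> waiters;   // requests parked for later
    std::deque<OpRequestRef> op_queue;  // stands in for the per-PG queue

    // Common sink for both paths: put the request on the queue.
    void dispatch_op(OpRequestRef op) { op_queue.push_back(std::move(op)); }

    // Path 1: wrap a freshly received message in a new OpRequest.
    void _dispatch(const std::string &msg) {
        dispatch_op(std::make_shared<OpRequest>(OpRequest{msg}));
    }

    // Path 2: drain requests that already exist and were waiting.
    void do_waiters() {
        while (!waiters.empty()) {
            OpRequestRef op = waiters.front();
            waiters.pop_front();
            dispatch_op(op);
        }
    }
};
```

Whichever path a request takes, it ends up on the same queue, which is the point at which the worker threads in the next section take over.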
Request Processing
struct OpWQ: public ThreadPool::WorkQueueVal<pair<PGRef, OpRequestRef>, PGRef >
- src/osd/OSD.h:1101
void OSD::OpWQ::_process(PGRef pg, ThreadPool::TPHandle &handle)
- src/osd/OSD.cc:7604
void OSD::dequeue_op(PGRef pg, OpRequestRef op, ThreadPool::TPHandle &handle)
- src/osd/OSD.cc:7643
void ReplicatedPG::do_request(OpRequestRef op, ThreadPool::TPHandle &handle)
- src/osd/ReplicatedPG.cc:1080
void ReplicatedPG::do_op(OpRequestRef op)
- src/osd/ReplicatedPG.cc:1191
void ReplicatedPG::execute_ctx(OpContext *ctx)
- src/osd/ReplicatedPG.cc:1706
The following sub-trace shows the path taken to the actual logic behind a RADOS client write operation. All other client operations can be found down this path as well. For instance, CEPH_OSD_OP_WRITE is a sibling to all other client operations in a large switch statement in do_osd_ops.
int ReplicatedPG::prepare_transaction(OpContext *ctx)
- src/osd/ReplicatedPG.cc:5055
int ReplicatedPG::do_osd_ops(OpContext *ctx, vector<OSDOp>& ops)
- src/osd/ReplicatedPG.cc:2921
case CEPH_OSD_OP_WRITE
- src/osd/ReplicatedPG.cc:3650
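The shape of that switch statement can be sketched in a few lines. Everything here is hypothetical: the TOY_OP_* opcodes, ToyOp, ToyTransaction, and toy_do_osd_ops are invented for the example, and returning -EINVAL for an unknown opcode is an assumption rather than a description of what do_osd_ops actually returns.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <vector>

// Made-up opcodes; the real CEPH_OSD_OP_* constants are defined by Ceph.
enum ToyOpCode { TOY_OP_READ, TOY_OP_WRITE, TOY_OP_UNKNOWN };

struct ToyOp {
    ToyOpCode code;
    std::string data;
};

// Hypothetical transaction: just a list of buffers to be written.
struct ToyTransaction {
    std::vector<std::string> writes;
};

// Walk the ops in order, accumulating side effects into the transaction.
int toy_do_osd_ops(ToyTransaction &t, const std::vector<ToyOp> &ops) {
    for (const ToyOp &op : ops) {
        switch (op.code) {
        case TOY_OP_READ:
            break;  // reads leave the transaction untouched
        case TOY_OP_WRITE:
            t.writes.push_back(op.data);  // sibling case, like CEPH_OSD_OP_WRITE
            break;
        default:
            return -EINVAL;  // assumed error code for this sketch only
        }
    }
    return 0;
}
```

Each client operation is one case in the switch, and a write simply adds to the transaction that is being built up for later submission.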
The accumulated transaction is submitted in issue_repop, which then calls submit_transaction on the configured PGBackend (e.g. replication or erasure coding). The backend communicates with replicas and runs the transaction against the local object store.
void ReplicatedPG::issue_repop(RepGather *repop, utime_t now)
- src/osd/ReplicatedPG.cc:6660
virtual void submit_transaction(
- src/osd/PGBackend.h:490
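To make the backend hand-off concrete, here is a minimal sketch of a virtual submit_transaction call. ToyBackend, ToyReplicatedBackend, ToyTransaction, and toy_issue_repop are hypothetical names, the single-argument signature is invented (the real submit_transaction takes more parameters than shown in this post), and replicas are modelled as in-memory vectors rather than remote OSDs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical transaction: just a list of buffers to be written.
struct ToyTransaction {
    std::vector<std::string> writes;
};

// Simplified backend interface with a single virtual entry point.
struct ToyBackend {
    virtual ~ToyBackend() = default;
    virtual void submit_transaction(const ToyTransaction &t) = 0;
};

// Toy replication backend: copies the writes to every replica, then
// applies them to the local store.
struct ToyReplicatedBackend : ToyBackend {
    std::vector<std::string> local_store;
    std::vector<std::vector<std::string>> replicas;

    explicit ToyReplicatedBackend(std::size_t n) : replicas(n) {}

    void submit_transaction(const ToyTransaction &t) override {
        for (auto &r : replicas)
            r.insert(r.end(), t.writes.begin(), t.writes.end());
        local_store.insert(local_store.end(), t.writes.begin(), t.writes.end());
    }
};

// Stand-in for issue_repop: hands the transaction to whichever backend
// is configured, without knowing its concrete type.
void toy_issue_repop(ToyBackend &backend, const ToyTransaction &t) {
    backend.submit_transaction(t);
}
```

Because the call goes through the base interface, an erasure-coding backend could be swapped in without changing the caller, which is presumably the reason for the virtual method.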