3.3 Shipping of operation log


3.3.1 Shipping mechanism

The reintegrator, which is a user-level thread within Venus, manages update propagation. It periodically selects several records from the head of the CML and ships them to the servers. For records with no user-operation information attached, the reintegrator uses value shipping and makes a ViceReintegrate remote procedure call (RPC) to the server. The server, when processing the RPC, back-fetches the related container files from the client. If the reply of the RPC indicates success, the reintegrator will locally commit the updates. Local commitment of updates is the final step of successful update propagation, and includes updating the states of relevant objects, such as version vectors and dirty bits, and truncating the CML.

If user-operation information is available for a record, the reintegrator attempts operation shipping first. All the records associated with the same user operation are operation shipped together. The reintegrator selects the records, packs the operation log, and makes a UserOpPropagate RPC to the surrogate. If the reply indicates success, the reintegrator locally commits the updates. If the reply indicates failure, the reintegrator sets a flag in each of the records indicating that propagation by operation shipping has been tried and has failed. These records will then be value shipped.
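The decision logic described above can be sketched as follows. This is only an illustrative model, not the Venus implementation: `CMLRecord`, `propagate_head`, and the `op_ship` and `value_ship` callbacks (standing in for the UserOpPropagate and ViceReintegrate RPCs) are hypothetical names, and for simplicity the value-shipping step handles a single record or a single failed batch rather than several records from the head of the CML.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class CMLRecord:
    """Hypothetical stand-in for one CML record."""
    name: str
    user_op_id: Optional[int] = None  # None: no user-operation information attached
    op_ship_failed: bool = False      # set after a failed operation-shipping attempt


def commit(cml: List[CMLRecord], batch: List[CMLRecord]) -> None:
    """Local commitment, reduced here to removing the shipped records from the CML."""
    shipped = {id(r) for r in batch}
    cml[:] = [r for r in cml if id(r) not in shipped]


def propagate_head(
    cml: List[CMLRecord],
    op_ship: Callable[[List[CMLRecord]], bool],     # assumed wrapper: UserOpPropagate
    value_ship: Callable[[List[CMLRecord]], bool],  # assumed wrapper: ViceReintegrate
) -> str:
    """Propagate the record(s) at the head of the CML and report the method used."""
    if not cml:
        return "idle"
    head = cml[0]
    batch = [head]
    if head.user_op_id is not None and not head.op_ship_failed:
        # All records of the same user operation are shipped together.
        batch = [r for r in cml if r.user_op_id == head.user_op_id]
        if op_ship(batch):
            commit(cml, batch)
            return "operation"
        # Remember the failure so these records are not operation shipped again.
        for r in batch:
            r.op_ship_failed = True
    # Fall back to (or start with) value shipping.
    if value_ship(batch):
        commit(cml, batch)
        return "value"
    return "pending"
```

In this sketch, a record whose operation shipping has failed keeps its flag even when the subsequent value shipping also fails, so a later round goes straight to value shipping.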

3.3.2 Cost model

The current version of our prototype attempts operation shipping for a record whenever there is user-operation information available. This static approach implicitly assumes that the connectivity between a mobile client and its servers is always weak. In real life, a mobile client may have strong connectivity occasionally. During that time, as explained in the following paragraphs, value shipping is more efficient than operation shipping. We plan to enhance our prototype so that mobile clients dynamically decide whether they should use operation shipping or value shipping.

Our cost model compares the costs of value shipping with those of operation shipping. In each case, two different costs are involved: network traffic and elapsed time.

For value shipping, assuming the overhead is negligible, the network traffic is the total length L of the updated files, and the elapsed time is T_v = L/B_c, where B_c is the bandwidth of the network connecting the client to the server.

For operation shipping, the network traffic is the length of the operation log, L_op, and the elapsed time is T_op. The latter is composed of four components: (1) the time needed to ship the operation log (L_op/B_c), (2) the time needed to re-execute the operation (E), (3) the time needed for additional computational overhead (H_op), such as computing checksum information and encoding and decoding forward-error-correction codes, and (4) the time needed to ship the updated files to the servers. There are two cases for the last component. If the re-execution passes validation (accepted), the updated files are shipped from the surrogate, at a time cost of L/B_s, where B_s is the bandwidth of the network connecting the surrogate to the server. If the re-execution fails validation (rejected), the updated files are shipped from the client, at a time cost of L/B_c. The following equation summarizes the time costs involved:

$\displaystyle T_{op} = \left\{ \begin{array}{ll} L_{op}/B_c + E + H_{op} + L/B_s & \mbox{if accepted} \\ L_{op}/B_c + E + H_{op} + L/B_c & \mbox{if rejected} \end{array}\right. \qquad (1)$

Therefore, operation shipping is more favorable than value shipping only under certain conditions. Operation shipping saves network traffic if the operation log is more compact than the updated files (L_op < L). It also speeds up update propagation (T_op < T_v) if the following five conditions hold: (1) the re-execution is accepted, (2) the operation log is compact ( $L_{op} \ll L$ ), (3) the re-execution is fast ( $E \ll L/B_c$ ), (4) the time needed for additional computational overhead is small ( $H_{op} \ll L/B_c$ ), and (5) the surrogate has much better network connectivity than the client ( $B_s \gg B_c$ ).
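To make the comparison concrete, the two elapsed-time formulas can be written as a small sketch. The function names and the numbers in the example are illustrative assumptions, not measurements from the prototype; lengths are in bytes, bandwidths in bytes per second, and times in seconds.

```python
def value_ship_time(L: float, B_c: float) -> float:
    """T_v = L / B_c, with per-transfer overhead assumed negligible."""
    return L / B_c


def op_ship_time(L: float, L_op: float, B_c: float, B_s: float,
                 E: float, H_op: float, accepted: bool) -> float:
    """T_op from equation (1): log shipping + re-execution + overhead + file shipping."""
    # Accepted: the surrogate ships the files; rejected: the client ships them itself.
    file_ship = L / B_s if accepted else L / B_c
    return L_op / B_c + E + H_op + file_ship


# Illustrative (made-up) weak-connectivity scenario.
L, L_op = 1_000_000, 1_000       # updated files vs. operation log
B_c, B_s = 10_000, 1_000_000     # client link vs. surrogate link
E, H_op = 2.0, 0.5               # re-execution time and extra overhead

t_v = value_ship_time(L, B_c)
t_accepted = op_ship_time(L, L_op, B_c, B_s, E, H_op, accepted=True)
t_rejected = op_ship_time(L, L_op, B_c, B_s, E, H_op, accepted=False)
print(f"T_v={t_v:.1f}s  T_op(accepted)={t_accepted:.1f}s  T_op(rejected)={t_rejected:.1f}s")
```

With these assumed values, operation shipping takes a few seconds when accepted, against roughly 100 seconds for value shipping. In the rejected case, T_op always exceeds T_v by L_op/B_c + E + H_op, which is why acceptance is the first condition. When B_s equals B_c, as under strong connectivity, operation shipping cannot be faster than value shipping.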
