Re: [Xen-devel] [PATCH v4 3/4] xenstore: rework of transaction handling



On 30/03/17 17:41, Wei Liu wrote:
> On Thu, Mar 30, 2017 at 03:11:48PM +0200, Juergen Gross wrote:
> [...]
>> +
>> +/*
>> + * A node has been accessed.
>> + *
>> + * Modifying accesses (write, delete) always update the generation (global and
>> + * node->generation).
>> + *
>> + * Accesses in a transaction will be added to the list of accessed nodes
>> + * if not already done. Read type accesses will copy the node to the
>> + * transaction specific data base part, write type accesses go there
>> + * anyway.
>> + *
>> + * If not NULL, key will be supplied with name and length of name of the node
>> + * to be accessed in the data base.
>> + */
>> +int access_node(struct connection *conn, struct node *node,
>> +            enum node_access_type type, TDB_DATA *key)
>> +{
>> +    struct accessed_node *i = NULL;
>>      struct transaction *trans;
>> +    TDB_DATA local_key;
>> +    const char *trans_name = NULL;
>> +    int ret;
>> +    bool introduce = false;
>>  
>> -    if (!conn || !conn->transaction) {
>> +    if (type != NODE_ACCESS_READ) {
>>              /* They're changing the global database. */
> 
> This comment should be moved below.

Right.

> 
>>              node->generation = generation++;
>> -            return;
>> +            if (conn && !conn->transaction)
>> +                    wrl_apply_debit_direct(conn);
>> +    }
>> +
>> +    if (!conn || !conn->transaction) {
>> +            if (key)
>> +                    set_tdb_key(node->name, key);
>> +            return 0;
>>      }
>>  
>>      trans = conn->transaction;
>>  
>> -    node->generation = generation + trans->trans_gen++;
>> +    trans_name = transaction_get_node_name(node, trans, node->name);
>> +    if (!trans_name)
>> +            goto nomem;
>>  
>> -    list_for_each_entry(i, &trans->changes, list) {
>> -            if (streq(i->node, node->name)) {
>> -                    if (recurse)
>> -                            i->recurse = recurse;
>> -                    return;
>> -            }
>> -    }
>> -
>> -    i = talloc(trans, struct changed_node);
>> +    i = find_accessed_node(trans, node->name);
>>      if (!i) {
>> -            /* All we can do is let the transaction fail. */
>> -            generation++;
>> -            return;
>> +            i = talloc_zero(trans, struct accessed_node);
>> +            if (!i)
>> +                    goto nomem;
>> +            i->node = talloc_strdup(i, node->name);
>> +            if (!i->node)
>> +                    goto nomem;
>> +
>> +            introduce = true;
>> +            i->ta_node = false;
>> +
>> +            /* We have to verify read nodes only if we didn't write them. */
>> +            if (type == NODE_ACCESS_READ) {
>> +                    i->generation = node->generation;
>> +                    i->check_gen = true;
>> +                    if (node->generation != NO_GENERATION) {
>> +                            set_tdb_key(trans_name, &local_key);
>> +                            ret = write_node_raw(conn, &local_key, node);
>> +                            if (ret)
>> +                                    goto err;
>> +                            i->ta_node = true;
>> +                    }
>> +            }
> 
> I think the current code structure is a bit confusing.
> 
> For write types, the call to write_node_raw is outside of this function
> but for read type it is inside this function.

This is due to the different purpose of write_node_raw() in the two
cases:

In the read case we write an _additional_ node into the transaction,
while in the write case it is just the user's operation.
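
To illustrate, the read path has roughly this shape (a hand-written
sketch for this mail; get_node_from_db() is a made-up stand-in for
the real data base lookup, not a function from the patch):

static struct node *read_node(struct connection *conn, const void *ctx,
                              const char *name)
{
        struct node *node;

        node = get_node_from_db(conn, ctx, name); /* stand-in lookup */
        if (!node)
                return NULL;

        /*
         * For a read access access_node() copies the node into the
         * transaction specific part of the data base (via
         * write_node_raw()) in case it isn't there yet.
         */
        if (access_node(conn, node, NODE_ACCESS_READ, NULL)) {
                talloc_free(node);
                return NULL;
        }

        return node;
}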

> 
>> +            list_add_tail(&i->list, &trans->accessed);
>>      }
>> -    i->node = talloc_strdup(i, node->name);
>> -    if (!i->node) {
>> -            /* All we can do is let the transaction fail. */
>> -            generation++;
>> +
>> +    if (type != NODE_ACCESS_READ)
>> +            i->modified = true;
>> +
>> +    if (introduce && type == NODE_ACCESS_DELETE)
>> +            /* Nothing to delete. */
>> +            return -1;
>> +
>> +    if (key) {
>> +            set_tdb_key(trans_name, key);
>> +            if (type == NODE_ACCESS_WRITE)
>> +                    i->ta_node = true;
>> +            if (type == NODE_ACCESS_DELETE)
>> +                    i->ta_node = false;
> 
> The same caveat made me wonder if ta_node was really what it meant to
> be because I couldn't find where write_node_raw was.
> 
> Can I suggest you make them consistent? Either lift the write_node_raw
> for read to read_node or push down the write_node_raw in delete_node and
> write_node here?
> 
> If I misunderstand how this works, please tell me...

As I already said: in the read case writing the node depends on the
context, and lifting it up would seem rather strange: why should
reading a node contain a write_node_raw() call?

In the write case I believe we should keep write_node_raw() where
it belongs: in the appropriate function modifying the node.

All actions in access_node() are transaction specific and purely
_additional_ when running inside a transaction.
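
For illustration, write_node() then keeps this shape (sketch only,
error handling trimmed):

static int write_node(struct connection *conn, struct node *node)
{
        TDB_DATA key;

        /*
         * Transaction bookkeeping only; supplies the (possibly
         * transaction specific) key for the data base access.
         */
        if (access_node(conn, node, NODE_ACCESS_WRITE, &key))
                return errno;

        /* The actual modification stays here. */
        return write_node_raw(conn, &key, node);
}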

> 
>> +    }
>> +
>> +    return 0;
>> +
>> +nomem:
>> +    ret = ENOMEM;
>> +err:
>> +    talloc_free((void *)trans_name);
>> +    talloc_free(i);
>> +    trans->fail = true;
>> +    errno = ret;
>> +    return ret;
>> +}
>> +
>> +/*
>> + * Finalize transaction:
>> + * Walk through accessed nodes and check generation against global data.
>> + * If all entries match, read the transaction entries and write them without
>> + * transaction prepended. Delete all transaction specific nodes in the data
>> + * base.
>> + */
> [...]
>> @@ -269,7 +525,7 @@ void transaction_entry_inc(struct transaction *trans, unsigned int domid)
>>      d = talloc(trans, struct changed_domain);
>>      if (!d) {
>>              /* Let the transaction fail. */
>> -            generation++;
>> +            trans->fail = true;
>>              return;
>>      }
>>      d->domid = domid;
>> @@ -290,7 +546,7 @@ void transaction_entry_dec(struct transaction *trans, unsigned int domid)
>>      d = talloc(trans, struct changed_domain);
>>      if (!d) {
>>              /* Let the transaction fail. */
>> -            generation++;
>> +            trans->fail = true;
>>              return;
>>      }
>>      d->domid = domid;
>> @@ -313,6 +569,29 @@ void conn_delete_all_transactions(struct connection *conn)
>>      conn->transaction_started = 0;
>>  }
>>  
>> +int check_transactions(int (*func)(void *, const char *), void *par)
>> +{
> 
> I think you can make the func take hashtable * and the second parameter
> hashtable *.
> 
> You can then avoid changing remember_string.
> 
> Or even better, you don't need to take a callback function at all
> because AFAICT only remember_string is called.

Right. Will change. IIRC the main reason to do it this way was to
avoid making remember_string global.

> 
>> +    struct connection *conn;
>> +    struct transaction *trans;
>> +    char *tname;
>> +
>> +    list_for_each_entry(conn, &connections, list) {
>> +            list_for_each_entry(trans, &conn->transaction_list, list) {
>> +                    tname = talloc_asprintf(trans, "%"PRIu64,
>> +                                            trans->generation);
>> +                    if (!tname)
>> +                            return ENOMEM;
>> +
>> +                    if (!func(par, tname))
>> +                            return ENOMEM;
>> +
> 
> I guess a bigger question is why calling remember_string here would
> constitute a check on all transactions.

That's a valid question. The iteration over trans->accessed is missing.
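
Something like the following sketch, assuming remember_string() is
made visible to this file and transaction_get_node_name() from this
patch can be reused for building the transaction specific node names:

int check_transactions(struct hashtable *hash)
{
        struct connection *conn;
        struct transaction *trans;
        struct accessed_node *i;
        char *tname;

        list_for_each_entry(conn, &connections, list) {
                list_for_each_entry(trans, &conn->transaction_list, list) {
                        tname = talloc_asprintf(trans, "%"PRIu64,
                                                trans->generation);
                        if (!tname || !remember_string(hash, tname))
                                goto nomem;

                        /* The part missing above: remember the
                         * transaction specific nodes, too. */
                        list_for_each_entry(i, &trans->accessed, list) {
                                if (!i->ta_node)
                                        continue;
                                tname = transaction_get_node_name(trans,
                                                                  trans,
                                                                  i->node);
                                if (!tname || !remember_string(hash, tname))
                                        goto nomem;
                        }
                }
        }

        return 0;

nomem:
        errno = ENOMEM;
        return errno;
}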

> Sorry this is taking a bit long. I am still struggling with the code. I
> haven't got around to check if the code matches the algorithm yet.
> I expect a few iterations are needed.


Juergen


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel