- in depth docs about some merge() tips

- docs about backref cascade
- Another new flag on relationship(), cascade_backrefs,
disables the "save-update" cascade when the event was
initiated on the "reverse" side of a bidirectional
relationship.   This is a cleaner behavior so that
many-to-ones can be set on a transient object without
it getting sucked into the child object's session,
while still allowing the forward collection to
cascade.   We *might* default this to False in 0.7.
Mike Bayer
2010-09-22 14:22:16 -04:00
parent 7b8b23b427
commit eae4de02a9
8 changed files with 295 additions and 31 deletions
+10 -1
@@ -73,7 +73,16 @@ CHANGES
object is loaded, so backrefs aren't available until
after a flush. The flag is only intended for very
specific use cases.
- Another new flag on relationship(), cascade_backrefs,
disables the "save-update" cascade when the event was
initiated on the "reverse" side of a bidirectional
relationship. This is a cleaner behavior so that
many-to-ones can be set on a transient object without
it getting sucked into the child object's session,
while still allowing the forward collection to
cascade. We *might* default this to False in 0.7.
- Slight improvement to the behavior of
"passive_updates=False" when placed only on the
many-to-one side of a relationship; documentation has
+196 -25
@@ -346,6 +346,8 @@ The :func:`~sqlalchemy.orm.session.Session.add` operation **cascades** along
the ``save-update`` cascade. For more details see the section
:ref:`unitofwork_cascades`.
.. _unitofwork_merging:
Merging
-------
@@ -358,17 +360,17 @@ follows::
When given an instance, it follows these steps:
* It examines the primary key of the instance. If it's present, it attempts
to load an instance with that primary key (or pulls from the local
identity map).
* If there's no primary key on the given instance, or the given primary key
does not exist in the database, a new instance is created.
* The state of the given instance is then copied onto the located/newly
created instance.
* The operation is cascaded to associated child items along the ``merge``
cascade. Note that all changes present on the given instance, including
changes to collections, are merged.
* The new instance is returned.
* It examines the primary key of the instance. If it's present, it attempts
to load an instance with that primary key (or pulls from the local
identity map).
* If there's no primary key on the given instance, or the given primary key
does not exist in the database, a new instance is created.
* The state of the given instance is then copied onto the located/newly
created instance.
* The operation is cascaded to associated child items along the ``merge``
cascade. Note that all changes present on the given instance, including
changes to collections, are merged.
* The new instance is returned.
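The steps above can be sketched with a toy model. This is a plain-Python illustration only, not SQLAlchemy's implementation: ``Record``, ``ToySession``, ``rows`` and ``new`` are hypothetical names, the identity map is keyed by primary key alone, and the cascade to child items is left out.

```python
class Record:
    """Hypothetical stand-in for a mapped class; only attributes that were set live in __dict__."""

    def __init__(self, **kw):
        self.__dict__.update(kw)


class ToySession:
    """Toy model of the merge steps; assumes a single class and an 'id' primary key."""

    def __init__(self, rows=None):
        self.rows = dict(rows or {})   # stand-in for database rows: pk -> column values
        self.identity_map = {}         # pk -> instance already loaded in this session
        self.new = []                  # instances created by merge()

    def _locate(self, cls, pk):
        # Step 1: look in the identity map first, then "load" from the rows.
        if pk is None:
            return None
        if pk in self.identity_map:
            return self.identity_map[pk]
        row = self.rows.get(pk)
        if row is None:
            return None
        obj = cls(**row)
        self.identity_map[pk] = obj
        return obj

    def merge(self, given):
        pk = vars(given).get("id")
        target = self._locate(type(given), pk)
        if target is None:
            # Step 2: no primary key, or no matching row: create a new instance.
            target = type(given)()
            self.new.append(target)
            if pk is not None:
                self.identity_map[pk] = target
        # Step 3: copy only the state present on the given instance.
        for key, value in vars(given).items():
            setattr(target, key, value)
        # Step 4 (cascade to children) is omitted in this sketch.
        # Step 5: return the session's instance, never the given one.
        return target


session = ToySession(rows={1: {"id": 1, "name": "ed"}})
given = Record(id=1, name="edward")
merged = session.merge(given)
```

Here ``merged`` is the session's own instance carrying the state of ``given``, while ``given`` itself stays outside the toy session.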
With :func:`~sqlalchemy.orm.session.Session.merge`, the given instance is not
placed within the session, and can be associated with a different session or
@@ -377,19 +379,22 @@ taking the state of any kind of object structure without regard for its
origins or current session associations and placing that state within a
session. Here's two examples:
* An application which reads an object structure from a file and wishes to
save it to the database might parse the file, build up the structure, and
then use :func:`~sqlalchemy.orm.session.Session.merge` to save it to the
database, ensuring that the data within the file is used to formulate the
primary key of each element of the structure. Later, when the file has
changed, the same process can be re-run, producing a slightly different
object structure, which can then be ``merged`` in again, and the
:class:`~sqlalchemy.orm.session.Session` will automatically update the
database to reflect those changes.
* A web application stores mapped entities within an HTTP session object.
When each request starts up, the serialized data can be merged into the
session, so that the original entity may be safely shared among requests
and threads.
* An application which reads an object structure from a file and wishes to
save it to the database might parse the file, build up the
structure, and then use
:func:`~sqlalchemy.orm.session.Session.merge` to save it
to the database, ensuring that the data within the file is
used to formulate the primary key of each element of the
structure. Later, when the file has changed, the same
process can be re-run, producing a slightly different
object structure, which can then be ``merged`` in again,
and the :class:`~sqlalchemy.orm.session.Session` will
automatically update the database to reflect those
changes.
* A web application stores mapped entities within an HTTP session object.
When each request starts up, the serialized data can be
merged into the session, so that the original entity may
be safely shared among requests and threads.
:func:`~sqlalchemy.orm.session.Session.merge` is frequently used by
applications which implement their own second level caches. This refers to an
@@ -406,6 +411,133 @@ all of its children may not contain any pending changes, and it's also of
course possible that newer information in the database will not be present on
the merged object, since no load is issued.
Merge Tips
~~~~~~~~~~
:meth:`~.Session.merge` is an extremely useful method for many purposes. However,
it deals with the intricate border between objects that are transient/detached and
those that are persistent, as well as the automated transference of state.
The wide variety of scenarios that can present themselves here often requires a
more careful approach to the state of objects. Common problems with merge usually involve
some unexpected state regarding the object being passed to :meth:`~.Session.merge`.
Let's use the canonical example of the User and Address objects::
class User(Base):
__tablename__ = 'user'
id = Column(Integer, primary_key=True)
name = Column(String(50), nullable=False)
addresses = relationship("Address", backref="user")
class Address(Base):
__tablename__ = 'address'
id = Column(Integer, primary_key=True)
email_address = Column(String(50), nullable=False)
user_id = Column(Integer, ForeignKey('user.id'), nullable=False)
Assume a ``User`` object with one ``Address``, already persistent::
>>> u1 = User(name='ed', addresses=[Address(email_address='ed@ed.com')])
>>> session.add(u1)
>>> session.commit()
We now create ``a1``, an object outside the session, which we'd like
to merge on top of the existing ``Address``::
>>> existing_a1 = u1.addresses[0]
>>> a1 = Address(id=existing_a1.id)
A surprise would occur if we said this::
>>> a1.user = u1
>>> a1 = session.merge(a1)
>>> session.commit()
sqlalchemy.orm.exc.FlushError: New instance <Address at 0x1298f50>
with identity key (<class '__main__.Address'>, (1,)) conflicts with
persistent instance <Address at 0x12a25d0>
Why is that? We weren't careful with our cascades. The assignment
of ``a1.user`` to a persistent object cascaded to the backref of ``User.addresses``
and made our ``a1`` object pending, as though we had added it. Now we have
*two* ``Address`` objects in the session::
>>> a1 = Address()
>>> a1.user = u1
>>> a1 in session
True
>>> existing_a1 in session
True
>>> a1 is existing_a1
False
Above, our ``a1`` is already pending in the session. The
subsequent :meth:`~.Session.merge` operation essentially
does nothing. Cascade can be configured via the ``cascade``
option on :func:`.relationship`, although in this case it
would mean removing the ``save-update`` cascade from the
``User.addresses`` relationship - and usually, that behavior
is extremely convenient. The solution here would usually be to not assign
``a1.user`` to an object already persistent in the target
session.
Note that a new :func:`.relationship` option introduced in 0.6.5,
``cascade_backrefs=False``, will also prevent the ``Address`` from
being added to the session via the ``a1.user = u1`` assignment.
Further detail on cascade operation is at :ref:`unitofwork_cascades`.
Another example of unexpected state::
>>> a1 = Address(id=existing_a1.id, user_id=u1.id)
>>> a1.user is None
True
>>> a1 = session.merge(a1)
>>> session.commit()
sqlalchemy.exc.IntegrityError: (IntegrityError) address.user_id
may not be NULL
Here, we accessed ``a1.user``, which returned its default value
of ``None``. As a result of this access, ``None`` has been placed in the ``__dict__`` of
our object ``a1``. Normally, this operation creates no change event,
so the ``user_id`` attribute takes precedence during a
flush. But when we merge the ``Address`` object into the session, the operation
is equivalent to::
>>> existing_a1.id = existing_a1.id
>>> existing_a1.user_id = u1.id
>>> existing_a1.user = None
Above, both ``user_id`` and ``user`` are assigned to, and change events
are emitted for both. The ``user`` association
takes precedence, and ``None`` is applied to ``user_id``, causing a failure.
Most :meth:`~.Session.merge` issues can be examined by first checking:
is the object prematurely in the session?
.. sourcecode:: python+sql
>>> a1 = Address(id=existing_a1.id, user_id=u1.id)
>>> assert a1 not in session
>>> a1 = session.merge(a1)
Or is there state on the object that we don't want? Examining ``__dict__``
is a quick way to check::
>>> a1 = Address(id=existing_a1.id, user_id=u1.id)
>>> a1.user
>>> a1.__dict__
{'_sa_instance_state': <sqlalchemy.orm.state.InstanceState object at 0x1298d10>,
'user_id': 1,
'id': 1,
'user': None}
>>> # we don't want user=None merged, remove it
>>> del a1.user
>>> a1 = session.merge(a1)
>>> # success
>>> session.commit()
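The ``__dict__`` check can also be wrapped in a small helper to run before calling ``merge()``. The following is a minimal sketch, assuming that any attribute present in ``__dict__`` with a value of ``None`` deserves a second look; ``loaded_none_attributes`` and ``FakeAddress`` are hypothetical names used only for illustration, and ``FakeAddress`` is a plain class rather than a mapped one.

```python
def loaded_none_attributes(obj):
    """Return the names of attributes present in obj.__dict__ whose value is None.

    Keys starting with "_sa_" (such as _sa_instance_state) are skipped.
    """
    return sorted(
        key
        for key, value in vars(obj).items()
        if value is None and not key.startswith("_sa_")
    )


class FakeAddress:
    """Hypothetical plain-Python stand-in for the Address class."""


a1 = FakeAddress()
a1.id = 1
a1.user_id = 1
a1.user = None  # simulates the None left behind by reading a1.user

print(loaded_none_attributes(a1))  # ['user']
```

Whether a given ``None`` should be removed is a judgment call for the application; the helper only reports candidates and deletes nothing.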
Deleting
--------
@@ -729,7 +861,7 @@ relationship between an ``Order`` and an ``Item`` object.
The ``customer`` relationship specifies only the "save-update" cascade value,
indicating most operations will not be cascaded from a parent ``Order``
instance to a child ``User`` instance except for the
:func:`~sqlalchemy.orm.session.Session.add` operation. "save-update" cascade
:func:`~sqlalchemy.orm.session.Session.add` operation. ``save-update`` cascade
indicates that an :func:`~sqlalchemy.orm.session.Session.add` on the parent
will cascade to all child items, and also that items added to a parent which
is already present in a session will also be added to that same session.
@@ -752,6 +884,45 @@ objects to allow attachment to only one parent at a time.
The default value for ``cascade`` on :func:`~sqlalchemy.orm.relationship` is
``save-update, merge``.
``save-update`` cascade also takes place on backrefs by default. This means
that, given a mapping such as this::
mapper(Order, order_table, properties={
'items' : relationship(Item, backref='order')
})
If an ``Order`` is already in the session, and is assigned to the ``order``
attribute of an ``Item``, the backref appends the ``Item`` to the ``items``
collection of that ``Order``, resulting in the ``save-update`` cascade taking
place::
>>> o1 = Order()
>>> session.add(o1)
>>> o1 in session
True
>>> i1 = Item()
>>> i1.order = o1
>>> i1 in o1.items
True
>>> i1 in session
True
This behavior can be disabled as of 0.6.5 using the ``cascade_backrefs`` flag::
mapper(Order, order_table, properties={
'items' : relationship(Item, backref='order',
cascade_backrefs=False)
})
So above, the assignment of ``i1.order = o1`` will append ``i1`` to the ``items``
collection of ``o1``, but will not add ``i1`` to the session. You can, of
course, :func:`~.Session.add` ``i1`` to the session at a later point. This option
may be helpful for situations where an object needs to be kept out of a
session until its construction is completed, but still needs to be given
associations to objects which are already persistent in the target session.
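The decision behind this flag can be modeled as a single boolean rule. The sketch below is a toy function, not SQLAlchemy API: ``backref_cascades`` and its parameter names are hypothetical, and the expression mirrors the condition this commit adds to ``UOWEventHandler``, where the cascade proceeds only if ``cascade_backrefs`` is set or the event was initiated by the same attribute that is handling it.

```python
def backref_cascades(save_update, cascade_backrefs, handler_key,
                     initiator_key, already_in_session):
    """Toy model: should an attribute event add the related object to the session?

    handler_key is the attribute receiving the event (e.g. "items");
    initiator_key is the attribute the user actually touched (e.g. "order").
    """
    return (
        save_update
        and (cascade_backrefs or handler_key == initiator_key)
        and not already_in_session
    )


# i1.order = o1 reaches Order.items through the backref, so the initiator is "order".
print(backref_cascades(True, True, "items", "order", False))   # True
print(backref_cascades(True, False, "items", "order", False))  # False

# o1.items.append(i1) is initiated on "items" itself, so it still cascades.
print(backref_cascades(True, False, "items", "items", False))  # True
```

With ``cascade_backrefs=False``, only the event arriving through the backref is suppressed; the forward collection keeps its ``save-update`` behavior.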
.. _unitofwork_transaction:
Managing Transactions
+13 -1
@@ -255,7 +255,19 @@ def relationship(argument, secondary=None, **kwargs):
* ``all`` - shorthand for "save-update,merge, refresh-expire,
expunge, delete"
:param cascade_backrefs=True:
a boolean value indicating if the ``save-update`` cascade should
operate along a backref event. When set to ``False`` on a
one-to-many relationship that has a many-to-one backref, assigning
a persistent object to the many-to-one attribute on a transient object
will not add the transient to the session. Similarly, when
set to ``False`` on a many-to-one relationship that has a one-to-many
backref, appending a persistent object to the one-to-many collection
on a transient object will not add the transient to the session.
``cascade_backrefs`` is new in 0.6.5.
:param collection_class:
a class or callable that returns a new list-holding object. will
be used in place of a plain list for storing elements.
+4 -1
@@ -444,8 +444,10 @@ class RelationshipProperty(StrategizedProperty):
comparator_factory=None,
single_parent=False, innerjoin=False,
doc=None,
cascade_backrefs=True,
load_on_pending=False,
strategy_class=None, _local_remote_pairs=None, query_class=None):
strategy_class=None, _local_remote_pairs=None,
query_class=None):
self.uselist = uselist
self.argument = argument
@@ -460,6 +462,7 @@ class RelationshipProperty(StrategizedProperty):
self._user_defined_foreign_keys = foreign_keys
self.collection_class = collection_class
self.passive_deletes = passive_deletes
self.cascade_backrefs = cascade_backrefs
self.passive_updates = passive_updates
self.remote_side = remote_side
self.enable_typechecks = enable_typechecks
+2
@@ -1153,6 +1153,8 @@ class Session(object):
This operation cascades to associated instances if the association is
mapped with ``cascade="merge"``.
See :ref:`unitofwork_merging` for a detailed discussion of merging.
"""
if 'dont_load' in kw:
load = not kw['dont_load']
+6 -1
@@ -33,10 +33,13 @@ class UOWEventHandler(interfaces.AttributeExtension):
def append(self, state, item, initiator):
# process "save_update" cascade rules for when
# an instance is appended to the list of another instance
sess = _state_session(state)
if sess:
prop = _state_mapper(state).get_property(self.key)
if prop.cascade.save_update and item not in sess:
if prop.cascade.save_update and \
(prop.cascade_backrefs or self.key == initiator.key) and \
item not in sess:
sess.add(item)
return item
@@ -55,11 +58,13 @@ class UOWEventHandler(interfaces.AttributeExtension):
# is attached to another instance
if oldvalue is newvalue:
return newvalue
sess = _state_session(state)
if sess:
prop = _state_mapper(state).get_property(self.key)
if newvalue is not None and \
prop.cascade.save_update and \
(prop.cascade_backrefs or self.key == initiator.key) and \
newvalue not in sess:
sess.add(newvalue)
if prop.cascade.delete_orphan and \
+2 -1
@@ -187,17 +187,18 @@ def generate_round_trip_test(use_unions=False, use_joins=False):
pub = Publication(name='Test')
issue = Issue(issue=46,publication=pub)
location = Location(ref='ABC',name='London',issue=issue)
page_size = PageSize(name='A4',width=210,height=297)
magazine = Magazine(location=location,size=page_size)
page = ClassifiedPage(magazine=magazine,page_no=1)
page2 = MagazinePage(magazine=magazine,page_no=2)
page3 = ClassifiedPage(magazine=magazine,page_no=3)
session.add(pub)
session.flush()
print [x for x in session]
session.expunge_all()
+62 -1
@@ -4,7 +4,7 @@ from sqlalchemy import Integer, String, ForeignKey, Sequence, \
exc as sa_exc
from sqlalchemy.test.schema import Table, Column
from sqlalchemy.orm import mapper, relationship, create_session, \
sessionmaker, class_mapper, backref
sessionmaker, class_mapper, backref, Session
from sqlalchemy.orm import attributes, exc as orm_exc
from sqlalchemy.test import testing
from sqlalchemy.test.testing import eq_
@@ -939,6 +939,67 @@ class M2MCascadeTest(_base.MappedTest):
assert b1 not in a1.bs
assert b1 in a2.bs
class NoBackrefCascadeTest(_fixtures.FixtureTest):
run_inserts = None
@classmethod
@testing.resolve_artifact_names
def setup_mappers(cls):
mapper(Address, addresses)
mapper(User, users, properties={
'addresses':relationship(Address, backref='user',
cascade_backrefs=False)
})
mapper(Dingaling, dingalings, properties={
'address' : relationship(Address, backref='dingalings',
cascade_backrefs=False)
})
@testing.resolve_artifact_names
def test_o2m(self):
sess = Session()
u1 = User(name='u1')
sess.add(u1)
a1 = Address(email_address='a1')
a1.user = u1
assert a1 not in sess
sess.commit()
assert a1 not in sess
sess.add(a1)
d1 = Dingaling()
d1.address = a1
assert d1 in a1.dingalings
assert d1 in sess
sess.commit()
@testing.resolve_artifact_names
def test_m2o(self):
sess = Session()
a1 = Address(email_address='a1')
d1 = Dingaling()
sess.add(d1)
a1.dingalings.append(d1)
assert a1 not in sess
a2 = Address(email_address='a2')
sess.add(a2)
u1 = User(name='u1')
u1.addresses.append(a2)
assert u1 in sess
sess.commit()
class UnsavedOrphansTest(_base.MappedTest):
"""Pending entities that are orphans"""