update selectin docs

* correct many-to-one example that doesn't use JOIN or ORDER BY anymore
* Oracle does tuple IN, let's test it
* many-to-many is supported but joins all the way right now
* remove verbiage about yield_per for the moment to simplify updates to how yield_per works w/ new style execution. yield_per is difficult to explain and the section seems kind of complicated with those details added at the moment.

Change-Id: I010ed36f554f06310f336a5b12760c447b38ec01
2026-05-15 05:07:16 -04:00 · 2020-10-31 19:08:28 -04:00
parent 78d60e108e
commit 3710382de1
4 changed files with 50 additions and 68 deletions
@@ -751,26 +751,19 @@ Select IN loading
 -----------------

 Select IN loading is similar in operation to subquery eager loading, however
-the SELECT statement which is emitted has a much simpler structure than
-that of subquery eager loading.  Additionally, select IN loading applies
-itself to subsets of the load result at a time, so unlike joined and subquery
-eager loading, is compatible with batching of results using
-:meth:`_query.Query.yield_per`, provided the database driver supports simultaneous
-cursors.
-
-Overall, especially as of the 1.3 series of SQLAlchemy, selectin loading
-is the most simple and efficient way to eagerly load collections of objects
-in most cases.  The only scenario in which selectin eager loading is not feasible
-is when the model is using composite primary keys, and the backend database
-does not support tuples with IN, which includes SQLite, Oracle and
-SQL Server.
+the SELECT statement which is emitted has a much simpler structure than that of
+subquery eager loading.  In most cases, selectin loading is the simplest and
+most efficient way to eagerly load collections of objects.  The only scenario in
+which selectin eager loading is not feasible is when the model is using
+composite primary keys, and the backend database does not support tuples with
+IN, which currently includes SQL Server.

 .. versionadded:: 1.2

 "Select IN" eager loading is provided using the ``"selectin"`` argument to
 :paramref:`_orm.relationship.lazy` or by using the :func:`.selectinload` loader
 option.   This style of loading emits a SELECT that refers to the primary key
-values of the parent object, or in the case of a simple many-to-one
+values of the parent object, or in the case of a many-to-one
 relationship to those of the child objects, inside of an IN clause, in
 order to load related associations:

@@ -793,28 +786,22 @@ order to load related associations:
        addresses.user_id AS addresses_user_id
    FROM addresses
    WHERE addresses.user_id IN (?, ?)
-    ORDER BY addresses.user_id, addresses.id
    (5, 7)

 Above, the second SELECT refers to ``addresses.user_id IN (5, 7)``, where the
 "5" and "7" are the primary key values for the previous two ``User``
 objects loaded; after a batch of objects are completely loaded, their primary
 key values are injected into the ``IN`` clause for the second SELECT.
-Because the relationship between ``User`` and ``Address`` provides that the
+Because the relationship between ``User`` and ``Address`` has a simple [1]_
+primary join condition and provides that the
 primary key values for ``User`` can be derived from ``Address.user_id``, the
 statement has no joins or subqueries at all.

 .. versionchanged:: 1.3 selectin loading can omit the JOIN for a simple
   one-to-many collection.

-.. versionchanged:: 1.3.6 selectin loading can also omit the JOIN for a simple
-   many-to-one relationship.
-
-For collections, in the case where the primary key of the parent object isn't
-present in the related row, "selectin" loading will also JOIN to the parent
-table so that the parent primary key values are present.  This also takes place
-for a non-collection, many-to-one load where the related column values are not
-loaded on the parent objects and would otherwise need to be loaded:
+For simple [1]_ many-to-one loads, a JOIN is also not needed as the foreign key
+value from the parent object is used:

 .. sourcecode:: python+sql

@@ -826,19 +813,26 @@ loaded on the parent objects and would otherwise need to be loaded:
        addresses.user_id AS addresses_user_id
        FROM addresses
    SELECT
-        addresses_1.id AS addresses_1_id,
        users.id AS users_id,
        users.name AS users_name,
        users.fullname AS users_fullname,
        users.nickname AS users_nickname
-    FROM addresses AS addresses_1
-    JOIN users ON users.id = addresses_1.user_id
-    WHERE addresses_1.id IN (?, ?)
-    ORDER BY addresses_1.id
+    FROM users
+    WHERE users.id IN (?, ?)
    (1, 2)

-"Select IN" loading is the newest form of eager loading added to SQLAlchemy
-as of the 1.2 series.   Things to know about this kind of loading include:
+.. versionchanged:: 1.3.6 selectin loading can also omit the JOIN for a simple
+   many-to-one relationship.
+
+.. [1] by "simple" we mean that the :paramref:`_orm.relationship.primaryjoin`
+   condition expresses an equality comparison between the primary key of the
+   "one" side and a straight foreign key of the "many" side, without any
+   additional criteria.
+
+Select IN loading also supports many-to-many relationships, where it currently
+will JOIN across all three tables to match rows from one side to the other.
+
+Things to know about this kind of loading include:

 * The SELECT statement emitted by the "selectin" loader strategy, unlike
  that of "subquery", does not
@@ -851,53 +845,30 @@ as of the 1.2 series.   Things to know about this kind of loading include:
  is always linking directly to a parent primary key and can't really
  return the wrong result.

-* "selectin" loading, unlike joined or subquery eager loading, always emits
-  its SELECT in terms of the immediate parent objects just loaded, and not the
+* "selectin" loading, unlike joined or subquery eager loading, always emits its
+  SELECT in terms of the immediate parent objects just loaded, and not the
  original type of object at the top of the chain.  So if eager loading many
-  levels deep, "selectin" loading still uses no more than one JOIN, and usually
-  no JOINs, in the statement.   In comparison, joined and subquery eager
-  loading always refer to multiple JOINs up to the original parent.
+  levels deep, "selectin" loading will still not require any JOINs for simple
+  one-to-many or many-to-one relationships.   In comparison, joined and
+  subquery eager loading always refer to multiple JOINs up to the original
+  parent.

-* "selectin" loading produces a SELECT statement of a predictable structure,
-  independent of that of the original query.  As such, taking advantage of
-  a new feature with :meth:`.ColumnOperators.in_` that allows it to work
-  with cached queries, the selectin loader makes full use of the
-  :mod:`sqlalchemy.ext.baked` extension to cache generated SQL and greatly
-  cut down on internal function call overhead.
-
-* The strategy will only query for at most 500 parent primary key values at a
+* The strategy emits a SELECT for up to 500 parent primary key values at a
  time, as the primary keys are rendered into a large IN expression in the
  SQL statement.   Some databases like Oracle have a hard limit on how large
  an IN expression can be, and overall the size of the SQL string shouldn't
-  be arbitrarily large.   So for large result sets, "selectin" loading
-  will emit a SELECT per 500 parent rows returned.   These SELECT statements
-  emit with minimal Python overhead due to the "baked" queries and also minimal
-  SQL overhead as they query against primary key directly.
-
-* "selectin" loading is the only eager loading that can work in conjunction with
-  the "batching" feature provided by :meth:`_query.Query.yield_per`, provided
-  the database driver supports simultaneous cursors.   As it only
-  queries for related items against specific result objects, "selectin" loading
-  allows for eagerly loaded collections against arbitrarily large result sets
-  with a top limit on memory use when used with :meth:`_query.Query.yield_per`.
-
-  Current database drivers that support simultaneous cursors include
-  SQLite, PostgreSQL.   The MySQL drivers mysqlclient and pymysql currently
-  **do not** support simultaneous cursors, nor do the ODBC drivers for
-  SQL Server.
+  be arbitrarily large.

 * As "selectin" loading relies upon IN, for a mapping with composite primary
  keys, it must use the "tuple" form of IN, which looks like ``WHERE
  (table.column_a, table.column_b) IN ((?, ?), (?, ?), (?, ?))``. This syntax
-  is not supported on every database; within the dialects that are included
-  with SQLAlchemy, it is known to be supported by modern PostgreSQL, MySQL and
-  SQLite versions.  Therefore **selectin loading is not platform-agnostic for
-  composite primary keys**. There is no special logic in SQLAlchemy to check
+  is not currently supported on SQL Server and for SQLite requires at least
+  version 3.15.  There is no special logic in SQLAlchemy to check
  ahead of time which platforms support this syntax or not; if run against a
  non-supporting platform, the database will return an error immediately.   An
  advantage to SQLAlchemy just running the SQL out for it to fail is that if a
  particular database does start supporting this syntax, it will work without
-  any changes to SQLAlchemy.
+  any changes to SQLAlchemy (as was the case with SQLite).

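The batching described in the list above can be illustrated outside of SQLAlchemy. The following is a minimal sketch using only the standard library ``sqlite3`` module; it is not SQLAlchemy's implementation. The helper name ``load_addresses_selectin``, the ``BATCH_SIZE`` constant and the table contents are hypothetical and exist only for illustration. It emits one ``IN`` query per batch of parent primary keys, with no JOIN back to the parent table.

```python
import sqlite3

# Assumption: mirrors the 500-key limit mentioned in the docs text.
BATCH_SIZE = 500


def load_addresses_selectin(conn, user_ids, batch_size=BATCH_SIZE):
    """Hypothetical helper: return {user_id: [email, ...]}.

    Emits one SELECT ... WHERE user_id IN (...) per batch of parent keys.
    """
    collections = {uid: [] for uid in user_ids}
    for start in range(0, len(user_ids), batch_size):
        batch = user_ids[start:start + batch_size]
        # One bound parameter per parent primary key in this batch.
        placeholders = ", ".join("?" for _ in batch)
        sql = (
            "SELECT user_id, email FROM addresses "
            f"WHERE user_id IN ({placeholders})"
        )
        for user_id, email in conn.execute(sql, batch):
            collections[user_id].append(email)
    return collections


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE addresses "
    "(id INTEGER PRIMARY KEY, user_id INTEGER, email TEXT)"
)
conn.executemany(
    "INSERT INTO users VALUES (?, ?)", [(5, "spongebob"), (7, "jack")]
)
conn.executemany(
    "INSERT INTO addresses (user_id, email) VALUES (?, ?)",
    [(5, "sb@example.com"), (5, "sb2@example.com"), (7, "jack@example.com")],
)

# First SELECT: load the parent rows and collect their primary keys.
user_ids = [row[0] for row in conn.execute("SELECT id FROM users ORDER BY id")]
# Second SELECT(s): load the related rows using IN, batch by batch.
loaded = load_addresses_selectin(conn, user_ids)
```

Passing a smaller ``batch_size`` (for example ``1``) produces the same collections with more round trips, which is the trade-off the 500-key cap is balancing.
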
 In general, "selectin" loading is probably superior to "subquery" eager loading
 in most ways, save for the syntax requirement with composite primary keys
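
To see what the composite primary key requirement means in practice, the tuple form of IN can be tried directly against SQLite with the standard library ``sqlite3`` module. This is a rough sketch, not what SQLAlchemy emits verbatim: the table ``composite_demo`` and its columns are hypothetical. To the best of my knowledge, SQLite's row-value ``IN`` needs a subquery on the right-hand side, so the sketch uses the ``IN (VALUES ...)`` spelling rather than the generic ``IN ((?, ?), (?, ?))`` form quoted above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with a two-column (composite) primary key.
conn.execute(
    "CREATE TABLE composite_demo "
    "(x INTEGER, y INTEGER, z TEXT, PRIMARY KEY (x, y))"
)
conn.executemany(
    "INSERT INTO composite_demo VALUES (?, ?, ?)",
    [(1, 1, "a"), (1, 2, "b"), (2, 1, "c")],
)

keys = [(1, 2), (2, 1)]  # composite primary key values to look up
rows = []

# Row values need SQLite 3.15 or later, matching the version check in the
# requirements file in this commit.
if sqlite3.sqlite_version_info >= (3, 15, 0):
    values = ", ".join("(?, ?)" for _ in keys)
    sql = (
        "SELECT x, y, z FROM composite_demo "
        f"WHERE (x, y) IN (VALUES {values}) ORDER BY x, y"
    )
    # Flatten [(1, 2), (2, 1)] into [1, 2, 2, 1] for the bound parameters.
    params = [value for key in keys for value in key]
    rows = conn.execute(sql, params).fetchall()
```

The sketch does not handle an empty ``keys`` list, since a ``VALUES`` clause with no rows is not valid SQL. The empty-set case needs separate treatment, which is the situation the new ``tuple_in_w_empty`` requirement in this commit separates out.
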
@@ -359,6 +359,11 @@ class SuiteRequirements(Requirements):

        return exclusions.closed()

+    @property
+    def tuple_in_w_empty(self):
+        """Target platform supports tuple IN w/ empty set"""
+        return self.tuple_in
+
    @property
    def duplicate_names_in_cursor_description(self):
        """target platform supports a SELECT statement that has
@@ -864,7 +864,7 @@ class ExpandingBoundInTest(fixtures.TablesTest):

        self._assert_result(stmt, [], params={"q": [], "p": []})

-    @testing.requires.tuple_in
+    @testing.requires.tuple_in_w_empty
    def test_empty_heterogeneous_tuples(self):
        table = self.tables.some_table

@@ -880,7 +880,7 @@ class ExpandingBoundInTest(fixtures.TablesTest):

        self._assert_result(stmt, [], params={"q": []})

-    @testing.requires.tuple_in
+    @testing.requires.tuple_in_w_empty
    def test_empty_homogeneous_tuples(self):
        table = self.tables.some_table

@@ -329,7 +329,13 @@ class DefaultRequirements(SuiteRequirements):
                config, "sqlite"
            ) and config.db.dialect.dbapi.sqlite_version_info >= (3, 15, 0)

-        return only_on(["mysql", "mariadb", "postgresql", _sqlite_tuple_in])
+        return only_on(
+            ["mysql", "mariadb", "postgresql", _sqlite_tuple_in, "oracle"]
+        )
+
+    @property
+    def tuple_in_w_empty(self):
+        return self.tuple_in + skip_if(["oracle"])

    @property
    def independent_cursors(self):