Merge "Add new sections regarding schemas and reflection" into rel_1_4

This commit is contained in:
mike bayer
2021-11-18 16:14:44 +00:00
committed by Gerrit Code Review
7 changed files with 284 additions and 42 deletions
+1 -1
@@ -1048,7 +1048,7 @@ localized to the current VALUES clause being processed::
def mydefault(context):
return context.get_current_parameters()['counter'] + 12
-    mytable = Table('mytable', meta,
+    mytable = Table('mytable', metadata_obj,
Column('counter', Integer),
Column('counter_plus_twelve',
Integer, default=mydefault, onupdate=mydefault)
+37 -8
@@ -284,11 +284,11 @@ remote servers (Oracle DBLINK with synonyms).
What all of the above approaches have (mostly) in common is that there's a way
of referring to this alternate set of tables using a string name. SQLAlchemy
-refers to this name as the **schema name**. Within SQLAlchemy, this is nothing more than
-a string name which is associated with a :class:`_schema.Table` object, and
-is then rendered into SQL statements in a manner appropriate to the target
-database such that the table is referred towards in its remote "schema", whatever
-mechanism that is on the target database.
+refers to this name as the **schema name**. Within SQLAlchemy, this is nothing
+more than a string name which is associated with a :class:`_schema.Table`
+object, and is then rendered into SQL statements in a manner appropriate to the
+target database such that the table is referred towards in its remote "schema",
+whatever mechanism that is on the target database.
The "schema" name may be associated directly with a :class:`_schema.Table`
using the :paramref:`_schema.Table.schema` argument; when using the ORM
@@ -298,11 +298,27 @@ the parameter is passed using the ``__table_args__`` parameter dictionary.
The "schema" name may also be associated with the :class:`_schema.MetaData`
object where it will take effect automatically for all :class:`_schema.Table`
objects associated with that :class:`_schema.MetaData` that don't otherwise
specify their own name. Finally, SQLAlchemy also supports a "dynamic" schema name
system that is often used for multi-tenant applications such that a single set
of :class:`_schema.Table` metadata may refer to a dynamically configured set of
schema names on a per-connection or per-statement basis.
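This dynamic naming system can be sketched with the "schema translate map" execution option. The following is a minimal runnable sketch, not the full multi-tenant recipe: the ``tenant_a`` schema name is hypothetical, and SQLite's ``ATTACH`` is used here only to emulate a second schema so the example is self-contained:

```python
from sqlalchemy import Column, Integer, MetaData, Table, create_engine, text

metadata_obj = MetaData()

# a tenant-agnostic table: note that no schema is assigned here
accounts = Table("accounts", metadata_obj, Column("id", Integer, primary_key=True))

engine = create_engine("sqlite://")
with engine.begin() as conn:
    # emulate a separate schema using SQLite's ATTACH
    conn.execute(text("ATTACH ':memory:' AS tenant_a"))
    conn.execute(text("CREATE TABLE tenant_a.accounts (id INTEGER PRIMARY KEY)"))

    # route the schema-less Table to "tenant_a" for this connection
    tenant_conn = conn.execution_options(schema_translate_map={None: "tenant_a"})
    tenant_conn.execute(accounts.insert().values(id=7))
    rows = tenant_conn.execute(accounts.select()).scalars().all()

print(rows)  # [7]
```

A second connection could supply ``{None: "tenant_b"}`` instead, so the same ``Table`` metadata serves every tenant.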
.. topic:: What's "schema"?
SQLAlchemy's support for database "schema" was designed with first party
support for PostgreSQL-style schemas. In this style, there is first a
"database" that typically has a single "owner". Within this database there
can be any number of "schemas" which then contain the actual table objects.
A table within a specific schema is referred towards explicitly using the
syntax "<schemaname>.<tablename>". Contrast this to an architecture such
as that of MySQL, where there are only "databases", however SQL statements
can refer to multiple databases at once, using the same syntax except it
is "<database>.<tablename>". On Oracle, this syntax refers to yet another
concept, the "owner" of a table. Regardless of which kind of database is
in use, SQLAlchemy uses the phrase "schema" to refer to the qualifying
identifier within the general syntax of "<qualifier>.<tablename>".
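The "<qualifier>.<tablename>" syntax described above can be demonstrated with the standard-library ``sqlite3`` module alone, since SQLite's ``ATTACH`` provides exactly this kind of qualifier; the ``remote`` and ``banks`` names below are purely illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# attach a second database under the qualifier "remote"
conn.execute("ATTACH ':memory:' AS remote")
conn.execute("CREATE TABLE remote.banks (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO remote.banks (id, name) VALUES (1, 'acme')")

# the table is addressed as "<qualifier>.<tablename>"
row = conn.execute("SELECT name FROM remote.banks").fetchone()
print(row[0])  # acme
```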
.. seealso::
:ref:`orm_declarative_table_schema_name` - schema name specification when using the ORM
@@ -368,6 +384,8 @@ at once, such as::
:ref:`multipart_schema_names` - describes use of dotted schema names
with the SQL Server dialect.
:ref:`schema_table_reflection`
.. _schema_metadata_schema_name:
@@ -438,11 +456,11 @@ to specify that it should not be schema qualified may use the special symbol
schema=BLANK_SCHEMA # will not use "remote_banks"
)
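A small self-contained sketch of how the fragment above behaves: with a :class:`_schema.MetaData`-level default schema, tables pick up that schema unless they opt out via ``BLANK_SCHEMA`` (the table names here are illustrative only):

```python
from sqlalchemy import BLANK_SCHEMA, Column, Integer, MetaData, Table

metadata_obj = MetaData(schema="remote_banks")

# picks up the MetaData-level default schema "remote_banks"
financial_info = Table(
    "financial_info", metadata_obj,
    Column("id", Integer, primary_key=True),
)

# opts out of the default schema entirely; .schema will be None
local_table = Table(
    "local_table", metadata_obj,
    Column("id", Integer, primary_key=True),
    schema=BLANK_SCHEMA,
)

print(financial_info.schema)          # remote_banks
print(local_table.schema)             # None
print(sorted(metadata_obj.tables))    # ['local_table', 'remote_banks.financial_info']
```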
.. seealso::
:paramref:`_schema.MetaData.schema`
.. _schema_dynamic_naming_convention:
Applying Dynamic Schema Naming Conventions
@@ -454,11 +472,11 @@ basis, so that for example in multi-tenant situations, each transaction
or statement may be targeted at a specific set of schema names that change.
The section :ref:`schema_translating` describes how this feature is used.
.. seealso::
:ref:`schema_translating`
.. _schema_set_default_connections:
Setting a Default Schema for New Connections
@@ -506,6 +524,17 @@ for specific information regarding how default schemas are configured.
:ref:`postgresql_alternate_search_path` - in the :ref:`postgresql_toplevel` dialect documentation.
Schemas and Reflection
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The schema feature of SQLAlchemy interacts with the table reflection
feature introduced at :ref:`metadata_reflection_toplevel`. See the section
:ref:`metadata_reflection_schemas` for additional details on how this works.
Backend-Specific Options
------------------------
+218 -5
@@ -13,7 +13,7 @@ existing within the database. This process is called *reflection*. In the
most simple case you need only specify the table name, a :class:`~sqlalchemy.schema.MetaData`
object, and the ``autoload_with`` argument::
->>> messages = Table('messages', meta, autoload_with=engine)
+>>> messages = Table('messages', metadata_obj, autoload_with=engine)
>>> [c.name for c in messages.columns]
['message_id', 'message_name', 'date']
@@ -30,8 +30,8 @@ Below, assume the table ``shopping_cart_items`` references a table named
``shopping_carts``. Reflecting the ``shopping_cart_items`` table has the
effect such that the ``shopping_carts`` table will also be loaded::
->>> shopping_cart_items = Table('shopping_cart_items', meta, autoload_with=engine)
->>> 'shopping_carts' in meta.tables:
+>>> shopping_cart_items = Table('shopping_cart_items', metadata_obj, autoload_with=engine)
+>>> 'shopping_carts' in metadata_obj.tables
True
The :class:`~sqlalchemy.schema.MetaData` has an interesting "singleton-like"
@@ -43,7 +43,7 @@ you the already-existing :class:`~sqlalchemy.schema.Table` object if one
already exists with the given name. Such as below, we can access the already
generated ``shopping_carts`` table just by naming it::
-shopping_carts = Table('shopping_carts', meta)
+shopping_carts = Table('shopping_carts', metadata_obj)
Of course, it's a good idea to use ``autoload_with=engine`` with the above table
regardless. This is so that the table's attributes will be loaded if they have
@@ -61,7 +61,7 @@ Individual columns can be overridden with explicit values when reflecting
tables; this is handy for specifying custom datatypes, constraints such as
primary keys that may not be configured within the database, etc.::
->>> mytable = Table('mytable', meta,
+>>> mytable = Table('mytable', metadata_obj,
... Column('id', Integer, primary_key=True), # override reflected 'id' to have primary key
... Column('mydata', Unicode(50)), # override reflected 'mydata' to be Unicode
... # additional Column objects which require no change are reflected normally
@@ -119,6 +119,219 @@ object's dictionary of tables::
for table in reversed(metadata_obj.sorted_tables):
someengine.execute(table.delete())
.. _metadata_reflection_schemas:
Reflecting Tables from Other Schemas
------------------------------------
The section :ref:`schema_table_schema_name` introduces the concept of table
schemas, which are namespaces within a database that contain tables and other
objects, and which can be specified explicitly. The "schema" for a
:class:`_schema.Table` object, as well as for other objects like views, indexes and
sequences, can be set up using the :paramref:`_schema.Table.schema` parameter,
and also as the default schema for a :class:`_schema.MetaData` object using the
:paramref:`_schema.MetaData.schema` parameter.
The use of this schema parameter directly affects where the table reflection
feature will look when it is asked to reflect objects. For example, given
a :class:`_schema.MetaData` object configured with a default schema name
"project" via its :paramref:`_schema.MetaData.schema` parameter::
>>> metadata_obj = MetaData(schema="project")
The :meth:`_schema.MetaData.reflect` method will then utilize that configured ``.schema``
for reflection::
>>> # uses `schema` configured in metadata_obj
>>> metadata_obj.reflect(someengine)
The end result is that :class:`_schema.Table` objects from the "project"
schema will be reflected, and they will be populated as schema-qualified
with that name::
>>> metadata_obj.tables['project.messages']
Table('messages', MetaData(), Column('message_id', INTEGER(), table=<messages>), schema='project')
Similarly, an individual :class:`_schema.Table` object that includes the
:paramref:`_schema.Table.schema` parameter will also be reflected from that
database schema, overriding any default schema that may have been configured on the
owning :class:`_schema.MetaData` collection::
>>> messages = Table('messages', metadata_obj, schema="project", autoload_with=someengine)
>>> messages
Table('messages', MetaData(), Column('message_id', INTEGER(), table=<messages>), schema='project')
Finally, the :meth:`_schema.MetaData.reflect` method itself also allows a
:paramref:`_schema.MetaData.reflect.schema` parameter to be passed, so we
could also load tables from the "project" schema for a default configured
:class:`_schema.MetaData` object::
>>> metadata_obj = MetaData()
>>> metadata_obj.reflect(someengine, schema="project")
We can call :meth:`_schema.MetaData.reflect` any number of times with different
:paramref:`_schema.MetaData.reflect.schema` arguments (or none at all) to continue
populating the :class:`_schema.MetaData` object with more objects::
>>> # add tables from the "customer" schema
>>> metadata_obj.reflect(someengine, schema="customer")
>>> # add tables from the default schema
>>> metadata_obj.reflect(someengine)
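The sequence above can be exercised end-to-end without a real server; in this sketch SQLite's ``ATTACH`` emulates a second schema, and the ``project`` and ``users`` names are illustrative:

```python
from sqlalchemy import MetaData, create_engine, text

engine = create_engine("sqlite://")
with engine.begin() as conn:
    # emulate a "project" schema via SQLite's ATTACH
    conn.execute(text("ATTACH ':memory:' AS project"))
    conn.execute(text("CREATE TABLE project.messages (message_id INTEGER PRIMARY KEY)"))
    conn.execute(text("CREATE TABLE users (user_id INTEGER PRIMARY KEY)"))

    metadata_obj = MetaData()
    # schema-qualified reflection, then default-schema reflection
    metadata_obj.reflect(conn, schema="project")
    metadata_obj.reflect(conn)

# schema-qualified tables are keyed as "<schema>.<table>"
print(sorted(metadata_obj.tables))  # ['project.messages', 'users']
```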
.. _reflection_schema_qualified_interaction:
Interaction of Schema-qualified Reflection with the Default Schema
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. admonition:: Section Best Practices Summarized
In this section, we discuss SQLAlchemy's reflection behavior regarding
tables that are visible in the "default schema" of a database session,
and how these interact with SQLAlchemy directives that include the schema
explicitly. As a best practice, ensure the "default" schema for a database
is just a single name, and not a list of names; for tables that are
part of this "default" schema and can be named without schema qualification
in DDL and SQL, leave corresponding :paramref:`_schema.Table.schema` and
similar schema parameters set to their default of ``None``.
As described at :ref:`schema_metadata_schema_name`, databases that have
the concept of schemas usually also include the concept of a "default" schema.
The reason for this is naturally that when one refers to table objects without
a schema as is common, a schema-capable database will still consider that
table to be in a "schema" somewhere. Some databases such as PostgreSQL
take this concept further into the notion of a
`schema search path
<https://www.postgresql.org/docs/current/static/ddl-schemas.html#DDL-SCHEMAS-PATH>`_
where *multiple* schema names can be considered in a particular database
session to be "implicit"; referring to a table name that's in any of those
schemas will not require that the schema name be present (while at the same time
it's also perfectly fine if the schema name *is* present).
Since most relational databases therefore have the concept of a particular
table object which can be referred towards both in a schema-qualified way, as
well as an "implicit" way where no schema is present, this presents a
complexity for SQLAlchemy's reflection
feature. Reflecting a table in
a schema-qualified manner will always populate its :attr:`_schema.Table.schema`
attribute and additionally affect how this :class:`_schema.Table` is organized
into the :attr:`_schema.MetaData.tables` collection, that is, in a schema
qualified manner. Conversely, reflecting the **same** table in a non-schema
qualified manner will organize it into the :attr:`_schema.MetaData.tables`
collection **without** being schema qualified. The end result is that there
would be two separate :class:`_schema.Table` objects in the single
:class:`_schema.MetaData` collection representing the same table in the
actual database.
To illustrate the ramifications of this issue, consider tables from the
"project" schema in the previous example, and suppose also that the "project"
schema is the default schema of our database connection, or if using a database
such as PostgreSQL suppose the "project" schema is set up in the PostgreSQL
``search_path``. This would mean that the database accepts the following
two SQL statements as equivalent::
-- schema qualified
SELECT message_id FROM project.messages
-- non-schema qualified
SELECT message_id FROM messages
This is not a problem as the table can be found in both ways. However
in SQLAlchemy, it's the **identity** of the :class:`_schema.Table` object
that determines its semantic role within a SQL statement. Based on the current
decisions within SQLAlchemy, this means that if we reflect the same "messages" table in
both a schema-qualified as well as a non-schema qualified manner, we get
**two** :class:`_schema.Table` objects that will **not** be treated as
semantically equivalent::
>>> # reflect in non-schema qualified fashion
>>> messages_table_1 = Table("messages", metadata_obj, autoload_with=someengine)
>>> # reflect in schema qualified fashion
>>> messages_table_2 = Table("messages", metadata_obj, schema="project", autoload_with=someengine)
>>> # two different objects
>>> messages_table_1 is messages_table_2
False
>>> # stored in two different ways
>>> metadata_obj.tables["messages"] is messages_table_1
True
>>> metadata_obj.tables["project.messages"] is messages_table_2
True
The above issue becomes more complicated when the tables being reflected contain
foreign key references to other tables. Suppose "messages" has a "project_id"
column which refers to rows in another schema-local table "projects", meaning
there is a :class:`_schema.ForeignKeyConstraint` object that is part of the
definition of the "messages" table.
We can find ourselves in a situation where one :class:`_schema.MetaData`
collection may contain as many as four :class:`_schema.Table` objects
representing these two database tables, where one or two of the additional
tables were generated by the reflection process; this is because when
the reflection process encounters a foreign key constraint on a table
being reflected, it branches out to reflect that referenced table as well.
The decision making that SQLAlchemy uses to assign the schema to this
referenced table is that it will **omit a default schema** from the reflected
:class:`_schema.ForeignKeyConstraint` object if the owning
:class:`_schema.Table` also omits its schema name and these two objects
reside in the same schema, but will **include** it if
it were not omitted.
The common scenario is one where a table reflected in a schema qualified
fashion in turn loads a related table that will also be reflected in a schema
qualified fashion::
>>> # reflect "messages" in a schema qualified fashion
>>> messages_table_1 = Table("messages", metadata_obj, schema="project", autoload_with=someengine)
The above ``messages_table_1`` will refer to ``projects`` also in a schema
qualified fashion. This "projects" table will be reflected automatically by
the fact that "messages" refers to it::
>>> messages_table_1.c.project_id
Column('project_id', INTEGER(), ForeignKey('project.projects.project_id'), table=<messages>)
If some other part of the code reflects "projects" in a non-schema qualified
fashion, there are now two ``projects`` tables that are not the same::
>>> # reflect "projects" in a non-schema qualified fashion
>>> projects_table_1 = Table("projects", metadata_obj, autoload_with=someengine)
>>> # messages does not refer to projects_table_1 above
>>> messages_table_1.c.project_id.references(projects_table_1.c.project_id)
False
>>> # it refers to this one
>>> projects_table_2 = metadata_obj.tables["project.projects"]
>>> messages_table_1.c.project_id.references(projects_table_2.c.project_id)
True
>>> # they're different, as one is non-schema qualified and the other one is schema qualified
>>> projects_table_1 is projects_table_2
False
The above confusion can cause problems within applications that use table
reflection to load up application-level :class:`_schema.Table` objects, as
well as within migration scenarios, in particular such as when using Alembic
Migrations to detect new tables and foreign key constraints.
The above behavior can be remedied by sticking to one simple practice:
* Don't include the :paramref:`_schema.Table.schema` parameter for any
:class:`_schema.Table` that expects to be located in the **default** schema
of the database.
For PostgreSQL and other databases that support a "search" path for schemas,
add the following additional practice:
* Keep the "search path" narrowed down to **one schema only, which is the
default schema**.
.. seealso::
:ref:`postgresql_schema_reflection` - additional details of this behavior
as regards the PostgreSQL database.
.. _metadata_reflection_inspector:
Fine Grained Reflection with Inspector
+1 -1
@@ -232,7 +232,7 @@ such as `collation` and `charset`::
from sqlalchemy.dialects.mysql import VARCHAR, TEXT
-table = Table('foo', meta,
+table = Table('foo', metadata_obj,
Column('col1', VARCHAR(200, collation='binary')),
Column('col2', TEXT(charset='latin1'))
)
+24 -24
@@ -273,20 +273,22 @@ be reverted when the DBAPI connection has a rollback.
Remote-Schema Table Introspection and PostgreSQL search_path
------------------------------------------------------------
-**TL;DR;**: keep the ``search_path`` variable set to its default of ``public``,
-name schemas **other** than ``public`` explicitly within ``Table`` definitions.
-The PostgreSQL dialect can reflect tables from any schema. The
-:paramref:`_schema.Table.schema` argument, or alternatively the
-:paramref:`.MetaData.reflect.schema` argument determines which schema will
-be searched for the table or tables. The reflected :class:`_schema.Table`
-objects
-will in all cases retain this ``.schema`` attribute as was specified.
-However, with regards to tables which these :class:`_schema.Table`
-objects refer to
-via foreign key constraint, a decision must be made as to how the ``.schema``
-is represented in those remote tables, in the case where that remote
-schema name is also a member of the current
+.. admonition:: Section Best Practices Summarized
+
+    keep the ``search_path`` variable set to its default of ``public``, without
+    any other schema names. For other schema names, name these explicitly
+    within :class:`_schema.Table` definitions. Alternatively, the
+    ``postgresql_ignore_search_path`` option will cause all reflected
+    :class:`_schema.Table` objects to have a :attr:`_schema.Table.schema`
+    attribute set up.
+
+The PostgreSQL dialect can reflect tables from any schema, as outlined in
+:ref:`schema_table_reflection`.
+
+With regards to tables which these :class:`_schema.Table`
+objects refer to via foreign key constraint, a decision must be made as to how
+the ``.schema`` is represented in those remote tables, in the case where that
+remote schema name is also a member of the current
`PostgreSQL search path
<https://www.postgresql.org/docs/current/static/ddl-schemas.html#DDL-SCHEMAS-PATH>`_.
@@ -349,8 +351,8 @@ reflection process as follows::
>>> engine = create_engine("postgresql://scott:tiger@localhost/test")
>>> with engine.connect() as conn:
... conn.execute(text("SET search_path TO test_schema, public"))
-...     meta = MetaData()
-...     referring = Table('referring', meta,
+...     metadata_obj = MetaData()
+...     referring = Table('referring', metadata_obj,
... autoload_with=conn)
...
<sqlalchemy.engine.result.CursorResult object at 0x101612ed0>
@@ -359,7 +361,7 @@ The above process would deliver to the :attr:`_schema.MetaData.tables`
collection
``referred`` table named **without** the schema::
->>> meta.tables['referred'].schema is None
+>>> metadata_obj.tables['referred'].schema is None
True
To alter the behavior of reflection such that the referred schema is
@@ -370,8 +372,8 @@ dialect-specific argument to both :class:`_schema.Table` as well as
>>> with engine.connect() as conn:
... conn.execute(text("SET search_path TO test_schema, public"))
-...     meta = MetaData()
-...     referring = Table('referring', meta,
+...     metadata_obj = MetaData()
+...     referring = Table('referring', metadata_obj,
... autoload_with=conn,
... postgresql_ignore_search_path=True)
...
@@ -379,7 +381,7 @@ dialect-specific argument to both :class:`_schema.Table` as well as
We will now have ``test_schema.referred`` stored as schema-qualified::
->>> meta.tables['test_schema.referred'].schema
+>>> metadata_obj.tables['test_schema.referred'].schema
'test_schema'
.. sidebar:: Best Practices for PostgreSQL Schema reflection
@@ -401,13 +403,11 @@ installation, this is the name ``public``. So a table that refers to another
which is in the ``public`` (i.e. default) schema will always have the
``.schema`` attribute set to ``None``.
.. versionadded:: 0.9.2 Added the ``postgresql_ignore_search_path``
dialect-level option accepted by :class:`_schema.Table` and
:meth:`_schema.MetaData.reflect`.
.. seealso::
:ref:`reflection_schema_qualified_interaction` - discussion of the issue
from a backend-agnostic perspective
`The Schema Search Path
<https://www.postgresql.org/docs/9.0/static/ddl-schemas.html#DDL-SCHEMAS-PATH>`_
- on the PostgreSQL website.
+2 -2
@@ -4897,7 +4897,7 @@ class Computed(FetchedValue, SchemaItem):
from sqlalchemy import Computed
-Table('square', meta,
+Table('square', metadata_obj,
Column('side', Float, nullable=False),
Column('area', Float, Computed('side * side'))
)
@@ -4994,7 +4994,7 @@ class Identity(IdentityOptions, FetchedValue, SchemaItem):
from sqlalchemy import Identity
-Table('foo', meta,
+Table('foo', metadata_obj,
Column('id', Integer, Identity()),
Column('description', Text),
)
+1 -1
@@ -843,7 +843,7 @@ class UserDefinedType(util.with_metaclass(VisitableCheckKWArg, TypeEngine)):
Once the type is made, it's immediately usable::
-table = Table('foo', meta,
+table = Table('foo', metadata_obj,
Column('id', Integer, primary_key=True),
Column('data', MyType(16))
)