feat!: EXPOSED-320 Many-to-many relation with extra columns #2204

bog-walk · 2024-08-16T02:20:57Z

Description

Summary of the change:
Provides the functionality for DAO entities to set/get additional data to/from extra columns (non-referencing) in the intermediate table of a many-to-many relation.

Detailed description:

Why:
Additional columns (beyond the 2 reference columns) are possible on intermediate tables defined in many-to-many relations. But these columns cannot be accessed by the referencing DAO entities involved because via() creates an InnerTableLink class that ignores these columns. Workarounds include adding a 'fake' id primary key to the intermediate table and creating an associated entity, or duplicating inner logic with custom classes. The latter only partially works, however, because internal cache logic requires that the target of via() is an Entity.
What:
- This PR allows via() to be used in the same way as before by introducing a new entity class, InnerTableLinkEntity, that wraps the referenced entity (and delegates to its id and table) along with any additional data in the intermediate table row.
- Additional columns are no longer ignored by default: their values are required and are used in the generated SQL whenever a new reference collection is set.
- Additional column data can be accessed from either (or both) the source or target entity.
How:
- Add abstract InnerTableLinkEntity, with associated InnerTableLinkEntityClass, that forces 2 overrides so users define how to properly set and get any additional data. A subclass can be a regular class or a data class, as needed.
- via() and InnerTableLink now accept a list of columns as arguments to specify which additional columns should be included. This allows users to opt out of the new functionality, by providing emptyList(), if it breaks any current workaround.
- InnerTableLink internal logic uses additional columns to generate triggered delete and insert SQL. Previously, statements would only be generated if there was a change in the reference id. Now they will be generated even if the reference id is identical, as long as any of the additional data changes. This is necessary to allow reference collections to be updated properly.
- Add new internal EntityCache map just for InnerTableLinkEntity. Because the latter delegates to the wrapped entity, if the regular data cache is used, this special entity may override its cached wrapped entity or be incorrectly retrieved on find(). This map stores each entity by its target column and source id, as the intermediate table is expected to have a contract of uniqueness on the 2 reference columns.
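
The change in statement generation described in the list above can be sketched in plain Kotlin. This is only an illustration of the comparison, not the PR's implementation: LinkRow, LinkDiff, diffByIdOnly and diffByIdAndData are hypothetical names, and the intermediate-table rows are modeled as simple data classes rather than Exposed types.

```kotlin
// Hypothetical model of one intermediate-table row as seen from a single source entity.
// None of these names come from Exposed; they only illustrate the comparison.
data class LinkRow(val targetId: Int, val extra: Map<String, Any?> = emptyMap())

// Rows to remove and rows to add when a new reference collection is set.
data class LinkDiff(val toDelete: Set<LinkRow>, val toInsert: Set<LinkRow>)

// Previous behavior (as described above): only the reference id is compared,
// so a row whose additional data changed produces no statements.
fun diffByIdOnly(existing: Set<LinkRow>, updated: Set<LinkRow>): LinkDiff {
    val existingIds = existing.map { it.targetId }.toSet()
    val updatedIds = updated.map { it.targetId }.toSet()
    return LinkDiff(
        toDelete = existing.filter { it.targetId !in updatedIds }.toSet(),
        toInsert = updated.filter { it.targetId !in existingIds }.toSet()
    )
}

// New behavior: the whole row (id plus additional data) is compared,
// so a change in additional data alone yields a delete and an insert.
fun diffByIdAndData(existing: Set<LinkRow>, updated: Set<LinkRow>): LinkDiff =
    LinkDiff(toDelete = existing - updated, toInsert = updated - existing)
```

For a row with target id 11 whose approved value changes from true to false, diffByIdOnly returns empty sets, while diffByIdAndData returns the old row in toDelete and the new row in toInsert.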

Note: The original plan was to have this implementation alongside a more standard approach, which would get/set the additional data as a delegate field on an existing entity, for example:

class Project(id: EntityID<Int>) : IntEntity(id) {
    // ...
    var tasks by Task via ProjectTasks
}
class Task(id: EntityID<Int>) : IntEntity(id) {
    // ...
    var approved by ProjectTasks.approved
}

This worked well for safe setting and getting, but started raising questions when updates were introduced:

- Should it be possible to set the field approved in isolation? Meaning not as part of SizedCollection, but in new {} or through a standard task.approved = true?
- Would setting the field in isolation be considered an update, and should it then trigger a cascade by causing the reference field to also update? And if so, how?
- If a Task was already cached with its approved field set by references loaded from the intermediate table, and then something like Task.all() was loaded, the new task would override the cached task, and trying to access the approved field would cause an exception as it would rightly not have any value. Would it be expected that this shouldn't happen?
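
The caching question above can be made concrete with a small plain-Kotlin model. This is a sketch under stated assumptions, not Exposed code: the Task and TaskCache classes here are hypothetical stand-ins, with the cache keyed by id only and the approved value populated only when the entity is loaded through the intermediate table.

```kotlin
// Hypothetical model of the discarded design: `approved` lives on the entity itself
// and is only populated when the entity is loaded through the intermediate table.
class Task(val id: Int, private val approvedValue: Boolean? = null) {
    val approved: Boolean
        get() = approvedValue ?: error("approved was not loaded for Task $id")
}

// Minimal id-keyed cache, standing in for the regular entity data cache.
class TaskCache {
    private val byId = mutableMapOf<Int, Task>()

    // A later load of the same id replaces whatever was cached before.
    fun put(task: Task) { byId[task.id] = task }

    fun get(id: Int): Task? = byId[id]
}
```

Caching Task(11, true) and then Task(11), as a plain load of the task's own table would, leaves a cached entity whose approved getter throws an IllegalStateException.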

So I opted to go for the safer implementation and if users come forward requesting the design above, I'm hopeful that their use cases will answer some of these questions.

Type of Change

Please mark the relevant options with an "X":

New feature

Updates/remove existing public API methods:

Is breaking change

Affected databases:

All

Checklist

Unit tests are in place
The build is green (including the Detekt check)
All public methods affected by my PR have up to date API docs
Documentation for my change is up to date

Related Issues

EXPOSED-320, EXPOSED-443

An intermediate table is defined to link a many-to-many relation between 2 IdTables, with references defined using via(). If this intermediate table is defined with additional columns, these are not accessible through the linked entities. This PR refactors the InnerTableLink logic to include additional columns in generated SQL, which can be accessed as a regular field on one of the referencing entities. It also allows the possibility to access the additional data as a new entity type, which wraps the main child entity along with the additional fields. This is accomplished with the introduction of InnerTableLinkEntity.

- Remove approach to set/get additional data from an existing entity object field. This requires some UX concerns to be answered, for example, concerning caching. Would the wrapped entity (if loaded from a query of its own table) override the entity+data loaded from the many-to-many query? Would updating the field mean the reference should also trigger a delete+insert?
- Fix issue with updating and caching new additional data

- Fix detekt issue

- Add more tests (particularly for update) & rename test classes
- Refactor cache to ensure no overlap with wrapped type. Each link entity is now stored by its target column, source column, and target id (stored in entity)
- Move new entity classes to own file
- Refactor logic for deleting cached entities

- Fix KDocs samples

bog-walk · 2024-09-20T03:10:24Z

exposed-dao/src/main/kotlin/org/jetbrains/exposed/dao/InnerTableLinkEntity.kt

+     */
+    abstract fun getInnerTableLinkValue(column: Column<*>): Any?


Even though an entity class is being used to accomplish this feature, the original behavior and usage of via() entities should most likely stay the same. I think it would be best to override all standard entity functions, like delete(), all(), new(), findById(), etc., because they wouldn't work (the intermediate table does not have to be an IdTable).

This would mean that attempting to call them on an InnerTableLinkEntity throws an error instead. So the only way to insert or delete or retrieve these linked values would be by setting/getting the parent/child entity field declared with via(). Basically these entities would only exist or be used through the via field. Does that make sense?
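
A minimal sketch of that idea, in plain Kotlin rather than Exposed types: SimpleEntityClass, TaskLink and TaskLinks are hypothetical names, and the choice of UnsupportedOperationException is an assumption, since the comment above only says that an error would be thrown.

```kotlin
// Hypothetical, simplified stand-in for an entity class; not Exposed's real types.
open class SimpleEntityClass<T : Any>(private val factory: (Int) -> T) {
    private val store = mutableMapOf<Int, T>()

    open fun new(id: Int): T = factory(id).also { store[id] = it }
    open fun findById(id: Int): T? = store[id]
    open fun all(): List<T> = store.values.toList()
}

// Hypothetical link entity: the wrapped task id plus one additional column value.
data class TaskLink(val taskId: Int, val approved: Boolean)

// A link entity class that rejects direct use, so that link values can only be
// created, read, or removed through the field declared with via().
object TaskLinks : SimpleEntityClass<TaskLink>({ TaskLink(it, false) }) {
    private fun unsupported(name: String): Nothing =
        throw UnsupportedOperationException(
            "$name() is not supported; use the field declared with via() instead"
        )

    override fun new(id: Int): TaskLink = unsupported("new")
    override fun findById(id: Int): TaskLink? = unsupported("findById")
    override fun all(): List<TaskLink> = unsupported("all")
}
```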

bog-walk · 2024-09-20T03:21:07Z

exposed-dao/src/main/kotlin/org/jetbrains/exposed/dao/EntityCache.kt

+    internal val innerTableLinks by lazy {
+        HashMap<Column<*>, MutableMap<EntityID<*>, MutableSet<InnerTableLinkEntity<*>>>>()
+    }


Ideally, there should be a cache that stores the link entities without any relation to their referenced counterparts, solely based on some special id, which could be retrieved from the ResultRow in wrapLinkRow(). This is, for example, how the data cache is set up for all regular entities.
This would mean either a brand new id for the entity (defeats the purpose as the point is to not introduce a new/fake id column in the intermediate table, since uniqueness is based on the 2 referencing columns) or some way to check ResultRow values against entity values. For the latter, I did consider forcing another override where the user defines some sort of equality match between ResultRow and InnerTableLinkEntity, but it got a bit messy.

What the above cache does is store all InnerTableLinkEntity instances for a target column and source (column) id, so uniqueness essentially relies on 3 values: target column (e.g. task in ProjectTasks), source id (e.g. project value in ProjectTasks), and target id (e.g. TaskWithData.id stored in the entity itself).
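
The three-value uniqueness described above can be modeled with a simplified nested map. This is a sketch, not the EntityCache code from the PR: LinkEntity and LinkCache are hypothetical names, Column<*> and EntityID<*> are replaced by String and Int so the example stays self-contained, and the replace-on-store policy is an assumption about how an update would be reflected.

```kotlin
// Hypothetical stand-in for an InnerTableLinkEntity: the wrapped target id plus extra data.
data class LinkEntity(val targetId: Int, val extra: Map<String, Any?>)

// Simplified model of the nested map: target column name -> source id -> link entities.
class LinkCache {
    private val links = HashMap<String, MutableMap<Int, MutableSet<LinkEntity>>>()

    // Store an entity, replacing any entry with the same target id
    // under this target column and source id.
    fun store(targetColumn: String, sourceId: Int, entity: LinkEntity) {
        val bucket = links
            .getOrPut(targetColumn) { HashMap() }
            .getOrPut(sourceId) { LinkedHashSet() }
        bucket.removeAll { it.targetId == entity.targetId }
        bucket.add(entity)
    }

    // Look up a single link entity by all three values.
    fun find(targetColumn: String, sourceId: Int, targetId: Int): LinkEntity? =
        links[targetColumn]?.get(sourceId)?.firstOrNull { it.targetId == targetId }

    // All link entities cached for one source entity through one target column.
    fun all(targetColumn: String, sourceId: Int): Set<LinkEntity> =
        links[targetColumn]?.get(sourceId)?.toSet() ?: emptySet()
}
```

Storing a link for target id 11 under source id 1 twice, first with approved = true and then with approved = false, leaves a single entry holding the latest data, while the same target id stored under source id 2 is kept separately.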

bog-walk · 2024-09-20T03:29:09Z

@obabichevjb I refactored this PR (and added more tests) because the original cache was failing if the same wrapped entity was used with different additional data (for example, updating TaskWithData(Task(11), true, 1) to TaskWithData(Task(11), false, 1) would not trigger the cache to update). Now the cache stores these special entities based on target column, source id, and the target (wrapped) id.

Please let me know if any API improvements could be considered, and what you think about overriding the standard entity functions to throw an error (like new() etc) so that the entity isn't accidentally used like a standard entity.

bog-walk linked an issue Aug 16, 2024 that may be closed by this pull request

How to use additional columns by Parent-Child reference? #600

Open

bog-walk requested review from e5l and joc-a August 16, 2024 02:21

This was linked to issues Aug 16, 2024

Many-to-Many mapping with attributes #1163

Open

Many-To-Many with extra column #928

Open

Extra columns in bridging table #667

Open

e5l approved these changes Aug 19, 2024

bog-walk marked this pull request as draft August 26, 2024 03:15

bog-walk force-pushed the bog-walk/fix-many-to-many-composite branch from 19d5e3f to 91372bb Compare August 27, 2024 02:26

bog-walk marked this pull request as ready for review September 20, 2024 02:49

bog-walk added 5 commits September 19, 2024 22:50

feat!: EXPOSED-320 Many-to-many relation with extra columns

8adc0aa

- Fix detekt issue

feat!: EXPOSED-320 Many-to-many relation with extra columns

0562b39

- Fix KDocs samples

bog-walk force-pushed the bog-walk/fix-many-to-many-composite branch from 91372bb to 0562b39 Compare September 20, 2024 02:50

bog-walk requested a review from obabichevjb September 20, 2024 03:29
