Subselects and overwriting SQL

Giovanni_Van_Geel · April 20, 2014, 7:24pm

Hi,

suppose I have a query in my original program, that has a "complex" subselect, and I want to convert that into an Aurelius entity.

An simple example:

select *,
    ( select count(*) from THealthElement
      where (THealthElement.BeginDate >= :BeginDate)
       and THealthElement.patient_id = TPatient.Patient_id) as ElementCount
  from TPatient

How would I map this ElementCount field? There is an option for sql expressions in the where clause, but there is non for fields? I want to this to be only one -trip to the database and not fire another select to fetch the ElementCount of one patient.

What I'm really missing in Aurelius is the option to overwrite the SQL (in case where you need to). That would solve the stored procedure problem as well. Aurelius should map TDatasets, no matter how they are created (simple selects, stored procedures, memory datasets, etc) and not force the autocreated SQL. I know that creating your own SQL might break compatibility between databases. Btw, DataAbstract of RemObjects fixed that by having the option to specify SQL based upon database driver.

I'm sorry to say, but this lack seems like a fundamental flaw, unless I'm missing something. In other words, is there an option to overwrite the SQL, in case you want to and if so can you specify it per database (driver).

wlandgraf · April 21, 2014, 8:22am

There is also an option for SQL expressions in projections (which can be used as select fields). But this won't result in an entity, but in specific TCriteriaResult.

I'm not sure what's the main point of overriding Aurelius SQL since the main feature of an ORM is to build the SQL for you. Note that I'm not against implementing such feature, I just don't consider it to be main priority since if you want to write your own SQL, you don't gain much by using Aurelius (what do you get?). Nevertheless, you can create views in your database server and map Aurelius to the view instead of the table, so the mentioned SQL could be a view (named TExtPatient for example) and you can map a class to that view (read-only, unless the view is updateable).

Giovanni_Van_Geel · April 21, 2014, 1:21pm

Maybe the example was a bit too simplistic. I was trying to point out that, in some cases, you want use the power of SQL, which is more powerful that what the ORM can offer you. What about with-statements in SQL Server or index hints on Oracle? The ORM completely prohibits you of using those and that might mean the difference of queries running in milliseconds or minutes. In bigger systems where datatables can be up to a couple of million records, performance is key! Having the ORM writing simple SQL statements is useful in most cases but it might be inperformant or result in complex class definitions, while having the option to write your own SQL, might simplify things considerably. Aurelius is great when it comes to the more basic crud statements, and yes, making basic selections, but once things get more complex, Aurelius just prohibits you from using the usual solution for those complex situations. A simple OnBeforeExecute eventhandler with the SQL as a var parameter and the IDbConnection as variable to know which database, might offer the solution here. Or a OnBeforeBuildSQL with a handled boolean as var parameter where you can write the complete SQL yourself.

I've heard this complaint before from other companies, not buying Aurelius because they have existing databases full with stored procedures and not willing to rewrite those. I believe an ORM's job is to map the datasets to objects. That's what the name says: object relational mapper. Writing the SQL for you, should a (default) feature but not mandatory.

I'd be happy to help you in whatever way I can, to have such a feature included, since I believe this to be crucial to develop more complex systems.

wlandgraf · April 21, 2014, 5:12pm

The events are interesting and could be implemented, yes, but as I said, this gives too much overhead to the user. Not sure if many users would use it.

Writing SQL is the main feature. How would you, for example, deal with criteria (where), joins, etc., when you have a complex query or too many associations, different lazy/eager associations? So the "raw SQL event" can be easily added from our side, but honestly, you would need to almost build your own ORM to be able to build the SQL.

I guess you are considering the ORM just as a column <-> property mapper? That's the only thing you would get (imho) with such custom SQL. And this is what people confused Aurelius with other ORM. Aurelius deal with associations, inheritance, and a small change in the way you ask for objects change the SQL dramatically.

Also, how updates are supposed to work?

Giovanni_Van_Geel · May 7, 2014, 8:23pm

I've been looking into the source code a bit and it seems it can/should be implemented on a different level. What's needed in this case is an option to provided the CommandPerformers through attributes or the MappingExplorer. If you go to the database and fetch/perform, don't build the SQL yourself, but execute this. An example with attributes could be:


[TEntity]
[Selector(TOwnSQLSelector('exec MySpecialList'))]
[Inserter(TOwnInserter('exec InsertIntoMySpecialList :MEDIA_NAME, :FILE_LOCATION, :DURATION'))]
[Updater(TOwnUpdater('exec UpdateMySpecialList :ID, :MEDIA_NAME, :FILE_LOCATION, :DURATION'))]
[Deleter(TOwnDeleter('exec DeleteMySpecialList :ID'))]
TMediaFile = class
  public
    [Column('ID', [TColumnProp.Unique, TColumnProp.Required, TColumnProp.NoUpdate])]
    FId: Integer;
    FMediaName: string;
    FFileLocation: string;
    FDuration: Nullable<integer>;    
  public
    property Id: integer read FId;


    [Column('MEDIA_NAME', [TColumnProp.Required], 100)]
    property MediaName: string read FMediaName write FMediaName;


    [Column('FILE_LOCATION', [], 300)]
    property FileLocation: string read FFileLocation write FFileLocation;


    [Column('DURATION', [])]
    property Duration: Nullable<integer> read FDuration write FDuration;    
  end;

Even a better way would be, if you could specify the database driver's name. Thus for SQL Server, execute this SQL, but for FireBird, execute this. A better option, imho. is to have this specified in code, through the MappingSetup instead of attributes, because it would get cluttered otherwise. Something like this might do:


procedure TDataModule1.CreatePerformers(ASetup: TMappingSetup);
var MediaFilePerformers: TDictionary<string, TPerformerSet>;
  SQLServerPerformers: TPerformerSet;
  FireBirdPerformers: TPerformerSet;
begin
  MediaFilePerformers := ASetup.Performers[TMediaFile];


  SQLServerPerformers := TPerformerSet.Create('SQLServer');
  SQLServerPerformers.Add(pcSelect,'exec MySpecialList');
  SQLServerPerformers.Add(pcInsert,'exec InsertIntoMySpecialList :MEDIA_NAME, :FILE_LOCATION, :DURATION');
  SQLServerPerformers.Add(pcUpdate,'exec UpdateMySpecialList :ID, :MEDIA_NAME, :FILE_LOCATION, :DURATION');
  SQLServerPerformers.Add(pcDelete,'exec DeleteMySpecialList :ID');


  SQLServerPerformers := TPerformerSet.Create('FireBird');
  SQLServerPerformers.Add(pcSelect,'execute procedure MySpecialList');
  SQLServerPerformers.Add(pcInsert,'execute procedure InsertIntoMySpecialList :MEDIA_NAME, :FILE_LOCATION, :DURATION');
  SQLServerPerformers.Add(pcUpdate,'execute procedure UpdateMySpecialList :ID, :MEDIA_NAME, :FILE_LOCATION, :DURATION');
  SQLServerPerformers.Add(pcDelete,'execute procedure DeleteMySpecialList :ID');  


  MediaFilePerformers.Add('SQLServer',SQLServerPerformers);
  MediaFilePerformers.Add('FireBird', FireBirdPerformers);  	  


  TMappingExplorer.ReplaceDefaultInstance(TMappingExplorer.Create(ASetup));
  ...
end;

That opens up the possibility to have execution of stored procedures on a class and have different executions per database type. But that also removes the option(s) to do search criteria on a SQL level or the option of foreign keys on this entity. Thus it comes with consequences. You can't use this class for criteria searches or join it with another entity (since Aurelius joins on SQL level, not on object level). Maybe it might be better not to call it an entity, but a (let's say) derived-entity.

Consider this: suppose you work in a bank and the DBA denies you direct access to the tables, because he wants to audit every interaction with the database. But you can fetch or manipulate data through stored procedures or views (And remeber that not all views are updatable). That setup makes Aurelius unusable.

Suppose the generated SQL performs badly and rewriting the SQL in another way increases performace by x%. This is an everyday situation of a developer. Denying access to the SQL, if needed, doesn't sound like a key feature at that point, but more like a nuisance.

The key feature of an ORM is to convert datasets to objectlists and back. How you get to those datasets, should be irrelevant. Whether the dataset comes from a simple select on one table, a select with a join on multiple tables or a stored procedure, should not matter. Your approach limits the programmer by denying control over the SQL. It's OK to have sql generation, because in most cases, it is trivial, but there are cases where this is not an option, like database privileges and performance. I understand you see SQL generation as a key feature, true, but it should not be obligated. By allowing SQL to be rewrittin, you open up the flexibility of the ORM.

wlandgraf · May 8, 2014, 6:48am

This is something to be thought carefully. When I say SQL is key feature, what I mean is it brings a lot of good stuff to Aurelius (which some people don't see, because they consider ORM is just a property->field mapping). That's the case here. This "transforms" Aurelius into a simple property->field mapping, which honestly could be implemented in a day with the new RTTI. The strongness of Aurelius imho is exactly the fact it allows real and complex object-oriented programming with persistence, by this meaning inheritance, querying, etc.. Just doing a simple "Name := FieldByName('Name').AsString" is not too much.

Well, I'm not discarding this option at all, but it's something complex that requires more thinking and designing.

White_Dave · May 19, 2014, 10:48pm

I'm with Giovanni on this. I've come up against a number of instances in which the SQL generated by Aurelius is seriously inefficient - an example from yesterday resulted in an Aurelius query taking 1.5 seconds, but the hand optimized SQL, creating the joins in the reverse order from the detail records back to the master, which is what the search requires, took only 0.01seconds - that's 150x faster.These aren't complex queries, it just matters a lot where the query starts when you have linked tables. Because of cases like this, I've come close to ripping Aurelius out and going back to regular datasets - which would be a shame because with a few tweaks, Aurelius could be a fantastic system.

wlandgraf · May 20, 2014, 10:28am

Are you willing to write custom SQL yourself for specific situations to improve performance, or you just want Aurelius improvement in performance? Those are different things.

We are going soon to focus on Aurelius performance improvements, and I would really appreciate if you could provide more details about this specific situation (which SQL was generated by Aurelius, what is the mapping the caused this, what is the SQL that performed better, even better if there is a project with the problem so we could debug). We will focus on optimizing them as well.

Also please note as with every tool, you must sometimes learn deeply about it to use it in a way that it performs better. Aren't there maybe alternatives of doing the things you are doing, using Aurelius itself, that could improve performance? In some situations, there isn't a single way to do things. Anyway, if you could provide more info, I will make sure we will look into it and improve it.

White_Dave · May 21, 2014, 12:11pm

I'd be perfectly happy writing custom SQL myself - I think in many of these situations, because of the generic nature of Aurelius, it would be the only way to go. Improvements in Aurelius itself would also be welcomed.
I'll put together some brief notes on the data structure and generated SQL, etc. Because of work pressures right now with the holiday weekend coming up, it may be a day or two before I can get it to you. I'll send it by private mail.
I appreciate that there may be ways that I'm using Aurelius that aren't optimal - I'm still learning there. Although I think in this case, because the entity type for returned lists always needs to be the initial item in the SQL, it's not always possible to generate efficient SQL from Aurelius. If we could start the select from a child object and work back to the entity target object, this could result in many many less record scans that always having to start at the entity object. Anyway, I'll send you detailed info in the next day or so that will make this more clear.
Thanks for looking into it.

wlandgraf · May 21, 2014, 12:16pm

Ok, I'm looking forward to your feedback. Two comments so far:

1) How do you expect telling Aurelius which fields belong to each property/class when using custom SQL? For example, suppose you are writing a custom SQL to retrieve an Invoice and its related customer, how would Aurelius know which fields belong to invoice, which belong to customer, and what would be their names?

2) If you have bidirectional associations (TInvoice.Items and TInvoiceItem.Invoice) you could start a query from TInvoiceItem and go to the TInvoice object through that association to perform filters and/or retrieving additional TInvoice information