Saturday, February 24, 2007

A comparison of Java's Hibernate and .NET 2.0 Strongly-typed DataSets

After a thorough study of Hibernate in Action (and completing my SCBCD [EJB 2.0] certification) in fall 2006, followed by the creation of a Hibernate demo project on my website, I now find myself working on a .NET 2.0 project, using strongly-typed DataSets for the object-relational mapping. I thought it would be useful to compare these two technologies. Hence this entry:

Hibernate and .NET 2.0 strongly-typed DataSets do share some features:
  1. An XML document (the DataSet) is used to determine the mapping rules. (It is also convenient, although much more data-driven than object-model-driven, that the mapping XML can be created in a designer mode by dragging tables onto the screen and linking relations between them.)
  2. All of the objects are generated automatically from this XML document, along with a separate sub-namespace (think Java package) containing the repository classes for populating the objects.
  3. The objects, object collections, and associations are all mapped to the database.
  4. The objects are persisted to and from the database using repository classes (known as table adapters in .NET DataSets). These classes are generated in a separate sub-namespace (package).
But there are significant differences / limitations:
  1. The DataSet model is a very strongly data-driven model. This makes it difficult to develop a true object model as you can in Hibernate. As a result, there is no real opportunity to build the object model according to best practices or design patterns, as the model is entirely generated from the database-driven DataSet. The model also uses database naming conventions: the collection is named with a "DataTable" suffix (e.g., the Person collection would be "PersonDataTable") and the contained objects are named with a "Row" suffix (the object would be "PersonRow").
  2. There is no HQL-style query language. You are directly addressing a specific database, and, of course, that database is almost exclusively Microsoft SQL Server. And while you can theoretically use queries, stored procedures are considered the norm.
  3. You cannot populate an object graph from a single query/stored procedure. You have two less-than-ideal choices:

    • You can create a query/sproc that contains joins, and bring back tabular data, which is obviously no longer a true object graph; it's just a result set. (I don't use this option.)
    • The second choice is to create SEPARATE queries in each table that forms a part of the object graph. For example, if you wanted to know about the Person, the person's Invoices, the person's Accounts, and the person's Appointments, all of which form the object graph, you would create separate repository queries (by personID) for EACH of the 4 objects, and you would then make 4 calls to populate each object/collection. This also means that you must populate them in the right order (parent first, then child) to prevent constraint errors.

  4. The most difficult limitation is that there is not a single mapping of the object model in a single namespace (package). Instead, a typical .NET project may contain multiple DataSets, often with redundant data tables across DataSets. The generated classes and related repository (table adapter) classes are DataSet-specific, so a Person object may end up redundantly as the PersonRow class in MANY different DataSets. And, because they are in distinct DataSets, they cannot be assigned to each other. Finally, a change in a table of the database must be implemented not in a single location, but redundantly across all DataSets that reference the particular data table.
  5. I have had to extend the generated objects with additional functionality in order to link each of the objects in the object graph to the same transaction. And since the auto-generated repository classes do not share a common interface, I have had to force interface implementation. I have achieved this with a .NET feature called "partial classes": files where you can "add to" (not extend!) an auto-generated class, so that when the auto-generated class gets regenerated, your modifications are not overwritten. It's hackish, but it works.
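
To make the parent-first loading from point 3 a bit more concrete, here is a minimal sketch in plain Java. It is an analogy only, not the .NET API: PersonDataSet, PersonAdapter, InvoiceAdapter, GraphAdapter and the in-memory "tables" are all hypothetical stand-ins, and the shared GraphAdapter interface plays the role of the interface I forced onto the generated table adapters in point 5. Each adapter runs its own by-personID query, and the child load fails if the parent has not been loaded first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class GraphFillSketch {

    // Hypothetical stand-ins for the generated row classes.
    static final class PersonRow {
        final int personId;
        final String name;
        PersonRow(int personId, String name) { this.personId = personId; this.name = name; }
    }

    static final class InvoiceRow {
        final int invoiceId;
        final int personId; // foreign key to PersonRow.personId
        InvoiceRow(int invoiceId, int personId) { this.invoiceId = invoiceId; this.personId = personId; }
    }

    // Stand-in for one typed DataSet holding a parent table and a child table.
    static final class PersonDataSet {
        final List<PersonRow> persons = new ArrayList<>();
        final List<InvoiceRow> invoices = new ArrayList<>();

        void addPerson(PersonRow row) { persons.add(row); }

        // Rejects a child row whose parent is not loaded yet (an enforced relation).
        void addInvoice(InvoiceRow row) {
            boolean parentLoaded = false;
            for (PersonRow p : persons) {
                if (p.personId == row.personId) { parentLoaded = true; }
            }
            if (!parentLoaded) {
                throw new IllegalStateException(
                    "no person " + row.personId + " loaded for invoice " + row.invoiceId);
            }
            invoices.add(row);
        }
    }

    // Assumed common interface; the generated adapters do not share one out of the box.
    interface GraphAdapter {
        void fillByPersonId(PersonDataSet target, int personId);
    }

    static final class PersonAdapter implements GraphAdapter {
        // Fake "Person" table standing in for the database.
        private static final List<PersonRow> TABLE =
            Arrays.asList(new PersonRow(1, "Ada"), new PersonRow(2, "Grace"));

        public void fillByPersonId(PersonDataSet target, int personId) {
            for (PersonRow row : TABLE) {
                if (row.personId == personId) { target.addPerson(row); }
            }
        }
    }

    static final class InvoiceAdapter implements GraphAdapter {
        // Fake "Invoice" table standing in for the database.
        private static final List<InvoiceRow> TABLE = Arrays.asList(
            new InvoiceRow(10, 1), new InvoiceRow(11, 1), new InvoiceRow(12, 2));

        public void fillByPersonId(PersonDataSet target, int personId) {
            for (InvoiceRow row : TABLE) {
                if (row.personId == personId) { target.addInvoice(row); }
            }
        }
    }

    // One call per adapter, in the order given; the caller must list parents first.
    static PersonDataSet load(int personId, List<GraphAdapter> adapters) {
        PersonDataSet ds = new PersonDataSet();
        for (GraphAdapter adapter : adapters) {
            adapter.fillByPersonId(ds, personId);
        }
        return ds;
    }

    public static void main(String[] args) {
        PersonDataSet ds = load(1,
            Arrays.<GraphAdapter>asList(new PersonAdapter(), new InvoiceAdapter()));
        System.out.println(ds.persons.size() + " person, " + ds.invoices.size() + " invoices");

        try {
            // Child first: the invoice rows have no parent row yet.
            load(1, Arrays.<GraphAdapter>asList(new InvoiceAdapter(), new PersonAdapter()));
        } catch (IllegalStateException e) {
            System.out.println("child-first load failed: " + e.getMessage());
        }
    }
}
```

In the real project there would be four adapters rather than two (Person, Invoices, Accounts, Appointments), but the shape is the same: one query per table, called in parent-then-child order.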
In terms of performance, I've found this solution to be quite fast. And the DataSets can be used as web service parameters and return types, allowing for a minimal number of web service calls (typically only one per process).
