Making the Case for Entity Framework in the Enterprise
Recently, I was met with some friction by the IT department at a client where, they asserted, that a decision had been made years ago to ban Entity Framework. Like many enterprise environments, this client was understandably concerned with the potential pitfalls of embracing Entity Framework. That meant that my job was to convince them otherwise – not to discount their apprehension, but quite the contrary – to demonstrate how EF can be leveraged for its advantages, and avoided for its shortcomings.
Entity Framework (EF) is a broad framework with many optional parts. There are several aspects of EF that provide great benefit, while others are a source of great consternation – particularly from the perspective of the database purist. As the cliché goes, “with great power comes great responsibility,” and so this blog post explores different aspects of EF, and identifies how to best leverage the framework, and not abuse it.
There are two distinct developer experiences with EF.
- Design time
- Run time
EF at Design Time
At design time, there are many different ways that developers can choose to use EF. All of them typically involve code generators, which could be any combination of automatically generated C# code (POCO classes, stored procedure wrappers) and T-SQL code (DDL).
The exact type of code generated depends on the methodology employed by the application developer. These are the set of “xxx-first” approaches to database design, which include:
Database-first is the only acceptable methodology for SQL Server purists. It essentially means that the database comes first, and that EF has no influence on the database design. EF does not generate DDL to create the database, and there is no EF “smell” in the design. Instead, the SQL Server expert implements the database by hand, applying all their experience, and using every best practice there is. They create the database with properly normalized tables, data types, foreign key constraints, unique constraints, check constraints, stored procedures, naming conventions, and so on. And, ideally, they build and maintain the database design using SQL Server Data Tools (SSDT) in Visual Studio with source control (TFS, GIT, etc.), rather than by managing scripts in SSMS.
With the other methodologies (model-first and code-first), the application developer focuses on their logical model (either as a visual .edmx file, or as POCO classes in C# code), and lets the EF design time “figure out the database” from the model. They let EF generate the DDL, which in turn creates the database. All SQL Server data types are inferred from their closest .NET equivalent. Then, as the model evolves, EF generates code migrations to rev-up and rev-down the database.
This approach is typically rejected by database purists, who rightfully object to the database being “reverse-engineered” from a model (.edmx file), or from C# POCO classes. With model-first and code-first, all direct control over the database design is lost. Thus, the only way to control the database design is indirectly, by coercing EF’s design-time DDL generator through the use of naming conventions or [attributes]. Essentially, it means that the application developer, who often lacks the experience and skills needed to properly design a database, gets to design the database. This is a terrible way to build and evolve an enterprise database model.
Our client’s application is strictly database-first, and uses SSDT to centrally store the database design, with full version history in TFS. Furthermore, it leverages a unique open-source T4 template code generator that enables database-first with an extensible object model, and zero friction for migrations. I have personally made several contributions to this utility, which is called the EntityFramework Reverse POCO Generator.
This generator produces C# POCO classes that are always in sync with the tables in the database, and it creates these as partial classes, so that the C# developer can easily build extensions that survive code regeneration. It also generates strongly typed stored procedure wrapper methods, which promotes and simplifies the use of stored procedures at run time.
EF at Run Time
While every developer design time experience may vary, the run time experience is always the same.
- Connect to the database with a context
- Retrieve objects by querying the context
- Modify objects in memory, while the context tracks changes
- Save modified objects back to the database
Here, database purists have additional concerns, particularly with item #2. Specifically how is the database being queried? It could be either direct SQL or a stored procedure. When EF is used to call a stored procedure, then it’s no different than calling that stored procedure using conventional ADO.NET or Enterprise Library. Whether it’s an EF DbContext object or an old-fashioned SqlCommand object, the query is prewritten and embedded in the stored procedure.
If, on the other hand, the developer uses EF to execute a LINQ query, then the EF runtime dynamically generates direct SQL. This can potentially (and all to often does) result in poor queries being sent to SQL Server by the application. I stress potentially because, a) the dynamically generated SQL queries have in fact improved greatly over the many versions of EF, and b) just because a stored procedure is being used instead doesn’t necessarily mean better performance; for example, if the stored procedure has a badly written query inside of it.
It’s similar to the design-time concern; an application developer that lacks experience in SQL Server writes a LINQ query in C# that results in generated SQL that performs poorly. All to often, these developers don’t understand deferred execution in EF, and they can repeatedly execute the same query over and over inside a loop (rather than just once) without realizing it. They are also prone to just append one .Include after another in their LINQ query, resulting in a great number of JOINs that returns a lot of duplicate parent data in the resultset.
EF has excellent run time support for stored procedures, and, as mentioned, the T4 template code generator provides the C# developer with ready-to-use methods that wrap every stored procedure. The vast majority of data access against the database should occur by calling stored procedures. Like the database modeling at design time, these stored procedures should be written by hand with careful attention paid to SQL Server best practices. However, there may be some lightweight cases (for example, simple single-row, single-table SELECTs) where you could allow EF to generate the query without a stored procedure, where the generated query is not objectionable.
Furthermore, you can implement a SelectEntity stored procedure that returns an entire entity as XML. This not only means using a stored procedure and not EF to query the database, but that the stored procedure will produce the most lightweight resultset without all the duplicate parent data that would result with traditional JOINs in the stored procedure.
One of the biggest client-side advantages that EF provides is change tracking. After retrieving and deserializing a complete entity via the aforementioned stored procedure, you can attach it to the EF context so that EF can track changes to it, just the same as it would if EF queried the database without the stored procedure. Then, as you manipulate the object in memory, the EF context keeps track of what rows in which tables need to be INSERTed, UPDATEd, or DELETEd. These are complex details that are very difficult to implement by hand, while allowing the context to figure that all out dramatically streamlines your development process.
In conclusion, your philosophy should be that best SQL Server practices are followed, both at design time with respect to data modeling, and at run time with respect to querying. You should take a “pick-and-choose” approach to EF, embracing those aspects that provide compelling developer benefits, and rejecting those that run contrary to proper database design and query principles.
Thus, your application should take a database-first approach to data modeling, and use stored procedures exclusively to query the database, ensuring no EF “smell” at either design time or run time.