Tallan's Technology Blog

Tallan's Top Technologists Share Their Thoughts on Today's Technology Challenges

The Cost of Reflection

dstrickland

Reflection has long been a point of contention in the development community. On one hand you have the elegance purists, who insist that Reflection is the panacea that ensures we write boilerplate once and only once; on the other, you have the performance purists, who insist that Reflection is too slow to be considered for a real enterprise solution.

So is either side right? Well, IMHO the truth is that it depends. Let me start by saying that I find myself torn; I am, if such a thing is possible, a staunch advocate of both camps. I love finding a way to avoid writing and maintaining repetitive boilerplate code, and I love designing frameworks that service whole domains in an abstract way. On the other hand, I love hearing a user say, “Gosh, that was fast. The old system used to take forever.”

So, if it can be posited that both goals are worthy of pursuit, can it also be accepted that there is a time and place for each? And if that can be accepted, can we then outline a matrix for deciding when and where to take each approach? To make this conversation a little more tangible, let’s illustrate a key area where Reflection vs. static code is being debated: data mapping, or Object-Relational Mapping.

The concerns that drive structures in an RDBMS are in many ways orthogonal to the structures we may want to implement in the domain of the application. For the sake of convenience, we will adopt the term Physical Model when referring to the structures the RDBMS uses to store data, and the term Logical Model when referring to the structures the domain layer exposes to application code.

So when your physical model diverges from your logical model, or you simply want to use custom objects instead of working directly with ADO.NET types, you are going to have to find a way to translate values back and forth between your custom types and the types ADO.NET uses to retrieve and submit data. Historically, this has been accomplished by manually mapping DataRow columns to object properties, in a manner such as this.

Listing 1

DataRow dataRow = GetDataRow(1);
List<Person> persons = new List<Person>();

for (int counter = 0; counter < upperBound; counter++) {
    Person person = new Person();
    person.Id = (int)dataRow["Id"];
    person.UpdatedUserId = (int)dataRow["UpdatedUserId"];
    person.FamilyId = (int)dataRow["FamilyId"];
    person.CompanyId = (int)dataRow["CompanyId"];
    person.DateModified = (DateTime)dataRow["DateModified"];
    person.DateOfBirth = (DateTime)dataRow["DateOfBirth"];
    person.FirstName = dataRow["FirstName"].ToString();
    person.LastName = dataRow["LastName"].ToString();
    person.MiddleInitial = dataRow["MiddleInitial"].ToString();
    person.Ssn = dataRow["Ssn"].ToString();
    person.Suffix = dataRow["Suffix"].ToString();
    person.Title = dataRow["Title"].ToString();
    persons.Add(person);
}

The reflective camp would accomplish the same task relying heavily on AOP and object attributes. A possible implementation of the same process might look something like this: Listing 2 shows the application code that requests a mapped object, and Listing 3 shows the DataStore class that accomplishes the mapping.

Listing 2

List<Person> persons = new List<Person>();

for (int counter = 0; counter < upperBound; counter++) {
    Person person = DataStore.Load<Person>(1);
    persons.Add(person);
}

Listing 3

public static class DataStore {

    public static T Load<T>(int id) {
        T obj = Activator.CreateInstance<T>();
        Map<T>(id, ref obj);
        return obj;
    }

    private static void Map<T>(int id, ref T obj) {
        if (!RefUtil.HasAttribute<MappableAttribute>(obj))
            throw new InvalidOperationException("Cannot map types not marked as Mappable.");

        DataRow row = DataProvider.Single<T>(id);
        PropertyInfo[] properties = obj.GetType().GetProperties();

        foreach (PropertyInfo prop in properties) {
            string columnName = GetPropertyColumnName(prop);
            object val = row[columnName];

            // A DataRow reports missing values as DBNull.Value, not null.
            if (val != DBNull.Value) {
                prop.SetValue(obj, val, null);
            }
        }
    }

    private static string GetPropertyColumnName(PropertyInfo prop) {
        // Honor an explicit column mapping if one was supplied; otherwise
        // assume the property name matches the column name.
        if (RefUtil.HasAttribute<ColumnAttribute>(prop)) {
            ColumnAttribute column = (ColumnAttribute)RefUtil.GetAttribute<ColumnAttribute>(prop);
            return column.Name;
        }

        return prop.Name;
    }
}
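Listings 2 and 3 lean on a few supporting pieces that are not shown: a marker MappableAttribute, a ColumnAttribute that can override a property's column name, and a small RefUtil helper that wraps the attribute lookups. A minimal sketch of what those might look like (the names and call signatures are taken from Listing 3; the bodies are assumptions):

using System;
using System.Reflection;

// Marks a class as eligible for reflective mapping.
[AttributeUsage(AttributeTargets.Class)]
public sealed class MappableAttribute : Attribute { }

// Overrides the column name used when mapping a property.
[AttributeUsage(AttributeTargets.Property)]
public sealed class ColumnAttribute : Attribute {
    public ColumnAttribute(string name) { Name = name; }
    public string Name { get; private set; }
}

// Thin wrapper over the reflection attribute APIs used by DataStore.
public static class RefUtil {

    public static bool HasAttribute<TAttr>(object obj) where TAttr : Attribute {
        return obj.GetType().IsDefined(typeof(TAttr), true);
    }

    public static bool HasAttribute<TAttr>(PropertyInfo prop) where TAttr : Attribute {
        return prop.IsDefined(typeof(TAttr), true);
    }

    public static Attribute GetAttribute<TAttr>(PropertyInfo prop) where TAttr : Attribute {
        return Attribute.GetCustomAttribute(prop, typeof(TAttr));
    }
}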

While the code necessary to accomplish the same task is less direct and contains more lines, it will service any object that is configured with the correct attributes. The example in Listing 1 would have to be reproduced for every persistent class in the domain. The savings in the reflective approach comes in code creation and maintenance. The developer no longer needs to manually write the code that sets object properties, thus avoiding the potential of missing one, and if the class changes, the new or removed properties are mapped dynamically without the need to update a mapping method. So from a maintenance perspective, I have eliminated code that has to be manually managed, and the type of code I have obviated is exactly the type that lends itself to omission errors. Now imagine a domain with 300 entities. That is a LOT of code I don’t have to maintain, and a lot of tests I don’t have to maintain, since I have one load method on one object instead of one load method per type.
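To make the maintenance argument concrete, a hypothetical Person entity only needs to be decorated for the DataStore to pick it up (the attributes follow the sketch above; the SocialSecurityNumber column is an assumed example of a property/column name mismatch):

using System;

[Mappable]
public class Person {
    public int Id { get; set; }
    public int UpdatedUserId { get; set; }
    public int FamilyId { get; set; }
    public int CompanyId { get; set; }
    public DateTime DateModified { get; set; }
    public DateTime DateOfBirth { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string MiddleInitial { get; set; }

    // Assumed example: the physical column name differs from the property name.
    [Column("SocialSecurityNumber")]
    public string Ssn { get; set; }

    public string Suffix { get; set; }
    public string Title { get; set; }
}

Add or remove a property on Person and the reflective Map<T> picks up the change on the next load; no mapping method needs to be touched.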

Well, that all sounds good, but what about the question of the performance impact all this great reflective code is having on my program? Well, the data there is interesting.

In order to test the impact Reflection has, I created a simple console application that calls each of the above code examples a number of times. On start up, the console will create one object of each type and display the time it took to complete the operation in milliseconds. From that point forward, the user can type in a number of instances to create, and the harness will run each method that number of times and then display the time to execute to the console. A rough sketch of that timing loop follows, and then the results (times are in milliseconds).
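The timing loop might look roughly like this (a sketch; StaticMapper.Load is a hypothetical stand-in for the hand-written mapping in Listing 1, and both strategies run against a fabricated DataRow, described further down, rather than a live database):

using System;
using System.Diagnostics;

class Program {
    static void Main() {
        while (true) {
            Console.Write("Iterations (blank to quit): ");
            string input = Console.ReadLine();
            if (string.IsNullOrEmpty(input)) break;
            int iterations = int.Parse(input);

            // Time the hand-written (static) mapping.
            Stopwatch sw = Stopwatch.StartNew();
            for (int i = 0; i < iterations; i++) {
                Person p = StaticMapper.Load(1);
            }
            sw.Stop();
            Console.WriteLine("Static mapping:     {0} ms", sw.ElapsedMilliseconds);

            // Time the attribute-driven (reflective) mapping.
            sw = Stopwatch.StartNew();
            for (int i = 0; i < iterations; i++) {
                Person p = DataStore.Load<Person>(1);
            }
            sw.Stop();
            Console.WriteLine("Reflective mapping: {0} ms", sw.ElapsedMilliseconds);
        }
    }
}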

Listing 4

Number of Iterations    Static Mapping (ms)    Reflective Mapping (ms)
1                       <1                     31
1                       <1                     <1
10                      1                      1
100                     1                      10
1000                    11                     98
10000                   110                    954
100000                  1176                   9772

Now, one thing to state here is that my test harness does not actually perform database calls. In order to keep the test free of exogenous influence, no actual database calls were made. Instead, a mock object was used to fabricate a DataRow to be used in the mapping processes. The end result is that this test is a straight comparison of the mapping code, without the vagaries of whatever the network looked like at the time to muddle things up.
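Fabricating that row can be done entirely in memory; a minimal sketch of a helper that builds a DataRow matching the Person columns (the class name and the values are arbitrary placeholders) might look like this:

using System;
using System.Data;

public static class MockData {

    public static DataRow FakePersonRow() {
        // An in-memory table that mirrors the Person columns; no database involved.
        DataTable table = new DataTable("Person");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("UpdatedUserId", typeof(int));
        table.Columns.Add("FamilyId", typeof(int));
        table.Columns.Add("CompanyId", typeof(int));
        table.Columns.Add("DateModified", typeof(DateTime));
        table.Columns.Add("DateOfBirth", typeof(DateTime));
        table.Columns.Add("FirstName", typeof(string));
        table.Columns.Add("LastName", typeof(string));
        table.Columns.Add("MiddleInitial", typeof(string));
        table.Columns.Add("Ssn", typeof(string));
        table.Columns.Add("Suffix", typeof(string));
        table.Columns.Add("Title", typeof(string));

        DataRow row = table.NewRow();
        row["Id"] = 1;
        row["UpdatedUserId"] = 42;
        row["FamilyId"] = 7;
        row["CompanyId"] = 3;
        row["DateModified"] = DateTime.Now;
        row["DateOfBirth"] = new DateTime(1980, 1, 1);
        row["FirstName"] = "Jane";
        row["LastName"] = "Doe";
        row["MiddleInitial"] = "Q";
        row["Ssn"] = "000-00-0000";
        row["Suffix"] = "Jr.";
        row["Title"] = "Ms.";
        table.Rows.Add(row);

        return row;
    }
}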

So, what does this tell us? For starters, let’s consider the first two runs. I mentioned earlier that the first thing the test harness will do is run each method once and report the times, which it does. The first run indicates that the reflective load took 31 ms to run and the static load took under 1 ms. This is great info, but it ignores the effect that Just-In-Time compilation has on the test. So, to get an apples-to-apples comparison, I enter the number 1 again and see that this time both methods report sub-1 ms execution times.

So, to paraphrase Emeril, I decided to kick it up a notch. I ran the program again for 10, 100, 1,000, 10,000, and 100,000 iterations. Interestingly, what we see are roughly consistent 10-to-1 ratios, with the static mapping solution consistently outperforming the reflection-based solution.

So on a purely throughput level, the static approach seems best, while on a maintainability and management level, the reflective solution seems to have clear advantages. Interestingly enough, this tool gives us some useful insight into how to make the decision. While it is true that the reflective solution is slower, the slowness is relative. In a desktop application, where you can make use of the client's processing power, the user will likely never perceive the difference between the two speeds until you are loading more than 1,000 objects at a time. The probability that you are loading more than a thousand objects at a time is pretty low in a GUI application, with the possible exception of a tabular data scenario, and even there you are likely to implement paging. Additionally, in list-based scenarios you may opt to bind directly to the ADO.NET object and only translate it into a custom type when an item is selected from the list and you are ready to do actual business logic on the selected item. Regardless of the route you select, it remains generally reasonable to state that in a typical desktop GUI application you are unlikely to hit a threshold of objects retrieved in a single batch that would mandate the use of static mapping for performance reasons.

That said, what about server side code? Well now that gets interesting. What kind of server side application is it? A web site? A WCF service? A Windows Service? Who are the clients? What is the expected transaction volume? How many processors/cores are available? How heavily is the current machine taxed? Do clients expect a response or are they fire and forget? Is object caching a viable option?

All of these questions, and more, can lend insight into what level of performance is necessary and whether a reflection-based solution is an option. As a summary, I have created a checklist of concerns I consider when deciding which way to go with a solution. The combination of answers to these questions will typically give me a very clear nudge in one direction or the other.

Website
  Static Loading: Use with high-traffic sites.
  Reflective Loading: Moderate- to low-volume sites should be viable.

WCF Service
  Static Loading: Use with high transaction volume services, especially if client calls are synchronous.
  Reflective Loading: Use with moderate- to low-volume services and with asynchronous clients.

Windows Service
  Static Loading: Start by asking why a Windows Service is making database calls. If the answer is valid, then consider the volume of transactions.
  Reflective Loading: Start by asking why a Windows Service is making database calls. If the answer is valid, then consider the volume of transactions.

SmartClient or Client/Server solution
  Static Loading: Treat as a WCF Service if the mapping takes place server side.
  Reflective Loading: Best-case argument for the use of Reflective Loading; the code can make use of the client's processing power.

Human User
  Static Loading: Use if loading large object graphs (>100 object instances) in a single pass.
  Reflective Loading: Viable for most data loading scenarios.

Automated User
  Static Loading: Use if transaction volume is high.
  Reflective Loading: Viable for moderate to low transaction volume scenarios.

Machine Heavily Taxed
  Static Loading: Use static mapping.
  Reflective Loading: Do not use reflection.

Machine Moderately to Lightly Taxed
  Static Loading: Viable.
  Reflective Loading: Viable.

Synchronous Client
  Static Loading: Viable.
  Reflective Loading: Viable, but user-perceived response time must be constantly monitored.

Asynchronous Client
  Static Loading: Viable, but in most situations this model is the least in need of the performance boost.
  Reflective Loading: Viable and appropriate.

Object Caching Viable
  Static Loading: Improves performance significantly. In high transaction volume scenarios, a combination of object caching and static mapping makes for a very snappy solution.
  Reflective Loading: Significantly increases performance and may make the approach more broadly applicable, since the reflection cost is largely confined to the initial load.
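Since object caching appears on both sides of that last factor, here is a minimal sketch of what a cache in front of DataStore.Load from Listing 3 might look like (the key format, locking scheme, and lack of expiration are all simplifying assumptions; a production cache would need an invalidation strategy):

using System;
using System.Collections.Generic;

// Hypothetical caching wrapper around DataStore.Load, keyed by type and id.
public static class CachedDataStore {

    private static readonly Dictionary<string, object> Cache = new Dictionary<string, object>();
    private static readonly object SyncRoot = new object();

    public static T Load<T>(int id) {
        string key = typeof(T).FullName + ":" + id;

        lock (SyncRoot) {
            object cached;
            if (!Cache.TryGetValue(key, out cached)) {
                // Pay the mapping (and reflection) cost only on the first request for a given entity.
                cached = DataStore.Load<T>(id);
                Cache[key] = cached;
            }
            return (T)cached;
        }
    }
}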

I hope you found this post useful. Please post your comments if you have anything to add or want to give your own 2 cents.

3 Comments

I like this topic, since the whole Reflection issue has come up ever since metadata-based languages such as Java and C# were developed.

Many frameworks such as Spring, Hibernate, and many other ORMs for database mapping use reflection, so there must be merit to using reflection in the proper manner.

Especially when using reflection to assign a value to a variable, most people use class inspection and an iterator to iterate through the properties and assign values accordingly.

The funny part is that most of the processing power and time wasted happens during the property/method iteration.

This can easily be resolved by caching the property/method information in a shared cache. If the cache is empty or stale, then iterate over the list of properties and store the information in the cache.

Once the cache has the information, we can just do a simple value assignment without going through the class type inspection via reflection again.

just my 2c. :)

-SK

derek.strickland
August 19, 2008 5:38 am

Thanks for the comment. Very good point on the property caching. I do that with the SQL statements that are used to query the DB, so as to limit the amount of string manipulation that gets done; it should be simple enough to apply the same principle to the PropertyInfos.
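A minimal sketch of what that PropertyInfo cache might look like (the class name and locking scheme are assumptions):

using System;
using System.Collections.Generic;
using System.Reflection;

// Sketch of the suggested cache: reflect over a type once,
// then reuse the results on every subsequent mapping call.
public static class PropertyCache {

    private static readonly Dictionary<Type, PropertyInfo[]> Cache =
        new Dictionary<Type, PropertyInfo[]>();
    private static readonly object SyncRoot = new object();

    public static PropertyInfo[] GetProperties(Type type) {
        lock (SyncRoot) {
            PropertyInfo[] properties;
            if (!Cache.TryGetValue(type, out properties)) {
                properties = type.GetProperties();
                Cache[type] = properties;
            }
            return properties;
        }
    }
}

Map<T> in Listing 3 would then call PropertyCache.GetProperties(obj.GetType()) instead of obj.GetType().GetProperties().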

I too agree that there must be merit to using it. Microsoft makes heavy use of Reflection techniques in many of their approaches to problem solving in 3.5, so it looks like a strategy we should learn more about.

Thanks for reading and commenting!

Good article. Reflection really is an interesting topic because it’s not a simple question of being slow or fast – it tends to force you to think about the whole situation, and start thinking about strategic points in the app where you can leverage the levels of usage and refresh requirements.

Being that performance, maintenance, and repetitive code are the key factors here, I’ve found that a discussion of code generation enters the room quite gracefully. Past projects I’ve worked on that have leveraged code generation (and regeneration) always seem to have arrived there either by trying to eliminate the writing of repetitive code like property accessors or maps, or by looking at code generation as a sort of caching mechanism that does its work pre-compile-time as opposed to doing it at runtime.

Code Generation can add that third dimension to your matrices above, allowing you to provide the performance of the static loading method, while being able to do so (with some required foresight) in a maintainable manner.
