We have an upcoming project in my group that is going to need to be very, very performant. This project will involve us querying for data from a SQL database, transforming that data into strongly-typed objects, then returning those objects to the calling system through a service layer. Eventually, this service will be a central pillar in our organization's service-oriented architecture (SOA), and as such it absolutely has to be fast.
We generally want to use Entity Framework for our ORM, but just
a little searching reveals
StackExchange questions and
blog post after
blog post detailing how EF is simply not up to par for high-performance systems. Into that gap steps so-called "micro-ORMs" like
Dapper.NET (which is used on the StackExchange family of sites including
StackOverflow) which promise performance at the cost of maintainability. As always, we also have the option of using straight ADO.NET queries.
Thing is, because performance needs to be front-and-center in this app, I'd like to be really sure which of these ORMs provide the best bang for my buck. So I worked up a sample project
over on GitHub that takes each of these three data access methods and
beats them till they beg for mercy tests them using the same sample data and same queries (with some caveats, as we'll see below). This post is divided up into the following sections:
Methodology
This test uses a database schema that looks like this:
In other words, a Sport has many Teams, and a Team has many Players.
I needed some sample data to test against. The sample project has
an entire sectiondedicated to producing this data, but suffice to say that you can select how many sports, how many teams per sport, and how many players per team you want for each test.
Now what I needed was a set of queries that I could create in each ORM and test against. I chose three different queries:
- Player by ID
- Players per Team
- Teams per Sport (including Players)
For each Query, I will run the test against all data in the database (e.g. for Player by ID I will select each player by their ID) and average the total time it takes to execute the query (including setting up the DbContext or SqlConnection, as the case may be) for each execution. Then, I will do multiple runs of this over the same data so that I can average them out and get a set of numbers that should clearly show which of the ORMs is the fastest.
Test Setup
As an example, here's the code for the Entity Framework, ADO.NET, and Dapper.NET test classes test class:
public class EntityFramework : ITestSignature
{
public long GetPlayerByID(int id)
{
Stopwatch watch = new Stopwatch();
watch.Start();
using (SportContext context = new SportContext())
{
var player = context.Players.Where(x => x.Id == id).First();
}
watch.Stop();
return watch.ElapsedMilliseconds;
}
public long GetPlayersForTeam(int teamId)
{
Stopwatch watch = new Stopwatch();
watch.Start();
using (SportContext context = new SportContext())
{
var players = context.Players.Where(x => x.TeamId == teamId).ToList();
}
watch.Stop();
return watch.ElapsedMilliseconds;
}
public long GetTeamsForSport(int sportId)
{
Stopwatch watch = new Stopwatch();
watch.Start();
using (SportContext context = new SportContext())
{
var players = context.Teams.Include(x=>x.Players).Where(x => x.SportId == sportId).ToList();
}
watch.Stop();
return watch.ElapsedMilliseconds;
}
}
public class ADONET : ITestSignature
{
public long GetPlayerByID(int id)
{
Stopwatch watch = new Stopwatch();
watch.Start();
using(SqlConnection conn = new SqlConnection(Constants.ConnectionString))
{
conn.Open();
using(SqlDataAdapter adapter = new SqlDataAdapter("SELECT Id, FirstName, LastName, DateOfBirth, TeamId FROM Player WHERE Id = @ID", conn))
{
adapter.SelectCommand.Parameters.Add(new SqlParameter("@ID", id));
DataTable table = new DataTable();
adapter.Fill(table);
}
}
watch.Stop();
return watch.ElapsedMilliseconds;
}
public long GetPlayersForTeam(int teamId)
{
Stopwatch watch = new Stopwatch();
watch.Start();
using(SqlConnection conn = new SqlConnection(Constants.ConnectionString))
{
conn.Open();
using(SqlDataAdapter adapter = new SqlDataAdapter("SELECT Id, FirstName, LastName, DateOfBirth, TeamId FROM Player WHERE TeamId = @ID", conn))
{
adapter.SelectCommand.Parameters.Add(new SqlParameter("@ID", teamId));
DataTable table = new DataTable();
adapter.Fill(table);
}
}
watch.Stop();
return watch.ElapsedMilliseconds;
}
public long GetTeamsForSport(int sportId)
{
Stopwatch watch = new Stopwatch();
watch.Start();
using(SqlConnection conn = new SqlConnection(Constants.ConnectionString))
{
conn.Open();
using(SqlDataAdapter adapter = new SqlDataAdapter("SELECT p.Id, p.FirstName, p.LastName, p.DateOfBirth, p.TeamId, t.Id as TeamId, t.Name, t.SportId FROM Player p INNER JOIN Team t ON p.TeamId = t.Id WHERE t.SportId = @ID", conn))
{
adapter.SelectCommand.Parameters.Add(new SqlParameter("@ID", sportId));
DataTable table = new DataTable();
adapter.Fill(table);
}
}
watch.Stop();
return watch.ElapsedMilliseconds;
}
}
public class Dapper : ITestSignature
{
public long GetPlayerByID(int id)
{
Stopwatch watch = new Stopwatch();
watch.Start();
using (SqlConnection conn = new SqlConnection(Constants.ConnectionString))
{
conn.Open();
var player = conn.Query<PlayerDTO>("SELECT Id, FirstName, LastName, DateOfBirth, TeamId FROM Player WHERE Id = @ID", new{ ID = id});
}
watch.Stop();
return watch.ElapsedMilliseconds;
}
public long GetPlayersForTeam(int teamId)
{
Stopwatch watch = new Stopwatch();
watch.Start();
using (SqlConnection conn = new SqlConnection(Constants.ConnectionString))
{
conn.Open();
var players = conn.Query<List<PlayerDTO>>("SELECT Id, FirstName, LastName, DateOfBirth, TeamId FROM Player WHERE TeamId = @ID", new { ID = teamId });
}
watch.Stop();
return watch.ElapsedMilliseconds;
}
public long GetTeamsForSport(int sportId)
{
Stopwatch watch = new Stopwatch();
watch.Start();
using (SqlConnection conn = new SqlConnection(Constants.ConnectionString))
{
conn.Open();
var players = conn.Query<PlayerDTO, TeamDTO, PlayerDTO>("SELECT p.Id, p.FirstName, p.LastName, p.DateOfBirth, p.TeamId, t.Id as TeamId, t.Name, t.SportId FROM Team t "
+ "INNER JOIN Player p ON t.Id = p.TeamId WHERE t.SportId = @ID", (player, team) => { return player; }, splitOn: "TeamId", param: new { ID = sportId });
}
watch.Stop();
return watch.ElapsedMilliseconds;
}
}
Note that in Dapper.NET and ADO.NET's cases, we will be selecting a row for each Player in the GetTeamsForSport query. This is not an exact comparison against the EF query, but for my purposes it works fine.
Results
The following results are for 10 iterations, each containing 8 sports, 30 teams in each sport, and 100 players per team.
Entity Framework Results
Run | Player by ID | Players for Team | Teams for Sport |
1 | 1.64ms | 4.57ms | 127.75ms |
2 | 0.56ms | 3.47ms | 112.5ms |
3 | 0.17ms | 3.27ms | 119.12ms |
4 | 1.01ms | 3.27ms | 106.75ms |
5 | 1.15ms | 3.47ms | 107.25ms |
6 | 1.14ms | 3.27ms | 117.25ms |
7 | 0.67ms | 3.27ms | 107.25ms |
8 | 0.55ms | 3.27ms | 110.62ms |
9 | 0.37ms | 4.4ms | 109.62ms |
10 | 0.44ms | 3.43ms | 116.25ms |
Average | 0.77ms | 3.57ms | 113.45ms |
ADO.NET Results
Run | Player by ID | Players for Team | Teams for Sport |
1 | 0.01ms | 1.03ms | 10.25ms |
2 | 0ms | 1ms | 11ms |
3 | 0.1ms | 1.03ms | 9.5ms |
4 | 0ms | 1ms | 9.62ms |
5 | 0ms | 1.07ms | 7.62ms |
6 | 0.02ms | 1ms | 7.75ms |
7 | 0ms | 1ms | 7.62ms |
8 | 0ms | 1ms | 8.12ms |
9 | 0ms | 1ms | 8ms |
10 | 0ms | 1.17ms | 8.88ms |
Average | 0.013ms | 1.03ms | 8.84ms |
Dapper.NET Results
Run | Player by ID | Players for Team | Teams for Sport |
1 | 0.38ms | 1.03ms | 9.12ms |
2 | 0.03ms | 1ms | 8ms |
3 | 0.02ms | 1ms | 7.88ms |
4 | 0ms | 1ms | 8.12ms |
5 | 0ms | 1.07ms | 7.62ms |
6 | 0.02ms | 1ms | 7.75ms |
7 | 0ms | 1ms | 7.62ms |
8 | 0ms | 1.02ms | 7.62ms |
9 | 0ms | 1ms | 7.88ms |
10 | 0.02ms | 1ms | 7.75ms |
Average | 0.047ms | 1.01ms | 7.94ms |
Analysis
As we can see in the data above Entity Framework is markedly slower than either ADO.NET or Dapper.NET, on the order of 3-10 times slower.
Let's be clear: the methodology used in this test had something to do with this, particularly the "Teams per Sport" query. In that query, Entity Framework was selecting both the teams in a given sport and the players involved with each team (via an Include() statement), whereas the ADO.NET and Dapper.NET queries were just selecting joined data. In a more rigorous statistical study, the test would either be improved or these results would be thrown out.
What's more interesting to me is that Dapper.NET was, on average, faster than ADO.NET for the more complex queries. It appears to me that there is a performance hit the first time you use Dapper.NET (as also appears to happen with EF) but once you get past that, Dapper.NET is amazingly fast. I suspect that this has something to do with the fact that in the ADO.NET test cases we are using a SqlDataAdapter, though I cannot prove this.
Even if you do throw out the "Teams per Sport" query, you're still left with EF being at least 3 times slower than either Dapper.NET or ADO.NET. The data shows that, at least in terms of raw speed and with these queries, Entity Framework will be the slowest option, and Dapper.NET will (narrowly) be the fastest. Which is why my ultimate conclusion might surprise you.
Conclusion
We're going to use Dapper.NET on our project; that much is not in doubt. However, we're not going to start development with it, and it will not be the only ORM in use. The plan is to develop this project using Entity Framework, and later optimize to use Dapper.NET in certain scenarios where the system needs a performance boost. Yes, we are going with the slowest option to start. Why would we do this?
Because the major drawback to using Dapper.NET is that you have naked SQL queries in your code. If anybody fat-fingers anything, we won't be aware of any issues until we run the tests against the code. Plus, the members of my group are more familiar with EF than Dapper.NET, and therefore development time will be quicker.
In short,
Dapper.NET is unquestionably faster than EF and slightly faster than straight ADO.NET, but we'll be doing the majority of development in EF and then optimizing with Dapper.NET where needed. We think this will strike a balance between ease of development and performance (and hopefully allow us to both
get it done and do it right).
Don't believe me? Good! Go
get the app and see for yourself!
(Also, if you see anything obviously wrong in my methodology, tell me about it! I'd be happy to update this post and the sample project if problems are found in it.)
Happy Coding!
Không có nhận xét nào:
Đăng nhận xét