Since varchar allocates space dynamically anyway, my question is: would using varchar(255) be more performant, or save more space, compared to using varchar(5000)? If so, why?

Thanks

edit: Looks like this question has been asked before.

Do you need a 5000-character-wide column? If so, why? Would a varchar(max) column work better for you here? – Richard L. Dawson 11 hours ago

Looks like this question has been asked before. I found the answer here. – Tintin 11 hours ago

Yes, varchar(5000) can be worse than varchar(255) if all values will fit into the latter. The reason is that SQL Server will estimate data size and, in turn, memory grants based on the declared (not actual) size of the columns in a table. When you have varchar(5000), it will assume that every value is 2,500 characters long, and reserve memory based on that.
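As a back-of-envelope sketch of that estimate (an assumption based on the half-of-declared-length behavior described above; exact per-row overhead varies by version and plan shape):

```sql
-- Rough sketch of the assumed bytes per value used for memory grant
-- estimates (assumption: half the declared length; nvarchar doubles
-- the byte count again because each character takes 2 bytes).
SELECT declared               = v.declared_len,
       assumed_varchar_bytes  = v.declared_len / 2,
       assumed_nvarchar_bytes = v.declared_len  -- 2 bytes/char * half the length
FROM (VALUES (255), (5000)) AS v(declared_len);
```

So a varchar(5000) column is assumed to carry roughly 2,500 bytes per value, versus roughly 127 for varchar(255), regardless of what is actually stored.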

Here's a demo from my recent GroupBy presentation on bad habits that makes it easy to prove for yourself (requires SQL Server 2016 for some of the sys.dm_exec_query_stats output columns, but should still be provable with SET STATISTICS TIME ON or other tools on earlier versions); it shows larger memory and longer runtimes for the same query against the same data - the only difference is the declared size of the columns:

-- create three tables with different column sizes
CREATE TABLE dbo.t1(a nvarchar(32),   b nvarchar(32),   c nvarchar(32),   d nvarchar(32));
CREATE TABLE dbo.t2(a nvarchar(4000), b nvarchar(4000), c nvarchar(4000), d nvarchar(4000));
CREATE TABLE dbo.t3(a nvarchar(max),  b nvarchar(max),  c nvarchar(max),  d nvarchar(max));
GO -- that's important
-- populate the first table with 5,000 rows at a time, 100 times
INSERT dbo.t1(a,b,c,d) SELECT TOP (5000) [IsDuduHappyNow?] = LEFT(name,1), 
  RIGHT(name,1), ABS(column_id/10), ABS(column_id%10)
FROM sys.all_columns ORDER BY object_id;
GO 100
INSERT dbo.t2(a,b,c,d) SELECT a,b,c,d FROM dbo.t1;
INSERT dbo.t3(a,b,c,d) SELECT a,b,c,d FROM dbo.t1;
GO

-- no "primed the cache in advance" tricks
DBCC FREEPROCCACHE WITH NO_INFOMSGS;
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;
GO

-- run the same query against all three tables
GO
SELECT DISTINCT a,b,c,d, DENSE_RANK() OVER (PARTITION BY b,c ORDER BY d DESC)
FROM dbo.t1 GROUP BY a,b,c,d ORDER BY c,a DESC;
GO
SELECT DISTINCT a,b,c,d, DENSE_RANK() OVER (PARTITION BY b,c ORDER BY d DESC)
FROM dbo.t2 GROUP BY a,b,c,d ORDER BY c,a DESC;
GO
SELECT DISTINCT a,b,c,d, DENSE_RANK() OVER (PARTITION BY b,c ORDER BY d DESC)
FROM dbo.t3 GROUP BY a,b,c,d ORDER BY c,a DESC;
GO

SELECT [table] = N'...' + SUBSTRING(t.[text], CHARINDEX(N'FROM ', t.[text]), 12) + N'...', 
s.last_dop, s.last_elapsed_time, s.last_grant_kb, s.max_ideal_grant_kb
FROM sys.dm_exec_query_stats AS s CROSS APPLY sys.dm_exec_sql_text(s.sql_handle) AS t
WHERE t.[text] LIKE N'%dbo.'+N't[1-3]%'
ORDER BY t.[text];

Right-size your columns, please.
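If you're not sure what size a column actually needs, a quick check like this can help (dbo.SomeTable and SomeColumn are placeholder names; test any ALTER outside production first):

```sql
-- Find the longest value actually stored (placeholder names):
SELECT MAX(LEN(SomeColumn))        AS max_chars,
       MAX(DATALENGTH(SomeColumn)) AS max_bytes
FROM dbo.SomeTable;

-- Then shrink the declared size to something sensible; SQL Server
-- will reject the change if an existing value would be truncated.
-- Preserve the column's existing NULL/NOT NULL setting explicitly:
ALTER TABLE dbo.SomeTable ALTER COLUMN SomeColumn varchar(255) NULL;
```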

Also, I re-ran the tests with varchar(32), varchar(255), varchar(5000), varchar(8000), and varchar(max). The results were similar, though the differences between 32 and 255, and between 5,000 and 8,000, were negligible:

[screenshot: memory grant and duration results for each declared size]

Here's another test, with the TOP (5000) change, for the more fully reproducible test I was being incessantly badgered about:

[screenshot: results of the TOP (5000) test]

So even with 5,000 rows instead of 10,000 rows (and there are 5,000+ rows in sys.all_columns at least as far back as SQL Server 2008 R2), a relatively linear progression is observed: even with the same data, the larger the declared size of the column, the more memory and time are required to satisfy the exact same query (even if it does have a meaningless DISTINCT).

This is really surprising. Would the difference between varchar(450) and varchar(255) be the same? (Or anything below 4000?) – a_horse_with_no_name 11 hours ago

@a_horse_with_no_name I haven't tested all permutations of runtime performance, but the memory grant would be a linear progression - it is simply a function of rowcount * (column_size / 2). – Aaron Bertrand 11 hours ago

That's pretty disappointing then. I thought that modern versions of SQL Server don't suffer from that (as long as the defined length is smaller than 8000, or maybe 4000). – a_horse_with_no_name 11 hours ago

@Dudu Great, why don't you prove that in your own answer instead of puffing up your chest and hijacking mine. Let me know when you come up with some magical number of rows in sys.all_columns that leads to no performance differences here. I'll wait. This answer demonstrates the problem, whether you want to nit-pick my methods or not. – Aaron Bertrand 10 hours ago

@Dudu Please stop hyper-focusing on irrelevant details that DO. NOT. MATTER. Like DISTINCT, it is also meaningless, never mind childish. – Aaron Bertrand 9 hours ago
