Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. Join them; it only takes a minute:

Sign up
Here's how it works:
  1. Anybody can ask a question
  2. Anybody can answer
  3. The best answers are voted up and rise to the top

I am designing a Invoice table and would like assistance deciding on a couple indexes and the Primary Key.

  1. For the Primary Key - using GUID may sound like a good idea to me, but does this mean the child tables need to have this very GUID as a foreign key as well?

  2. Invoices are always filtered on creation date using a date range. So instead of a datetime column, I plan to have an extra int column which only contain the date information like '20160101' because I assume comparing the int is much faster than datetime and would benefit me. Since the creation time will be ever increasing, I also plan to apply the Cluster Index on this column. Is this a good idea?

  3. Invoices may also be queried based on customer as well, should I put another index on CustomerID or compose with the date column?

share|improve this question

put on hold as unclear what you're asking by ypercubeᵀᴹ, Philᵀᴹ, Paparazzi, Erik, mustaccio 2 hours ago

Please clarify your specific problem or add additional details to highlight exactly what you need. As it's currently written, it’s hard to tell exactly what you're asking. See the How to Ask page for help clarifying this question.If this question can be reworded to fit the rules in the help center, please edit the question.

1  
Which DBMS? SQL Server? Oracle? – Mike Walsh 5 hours ago
    
Sorry no specific, just a discussion of the choices and trade offs. – Yiping 4 hours ago
1  
Joe Celko would probably tear your soul apart if do so... – Nelson Casanova 4 hours ago
2  
Why not just use a date for a date? Let me get this clear you want to use an int for date because you think date is too slow but you are going to GUID for PK over a simple int? – Paparazzi 3 hours ago
    
1. Yes. 2. No. 3. Impossible to say. – mustaccio 2 hours ago

Shorter answer to the title of the question: No. Why would you want to lose the ability to treat a date like a date. So important for sorting, date functions, etc.

A few thoughts to get you started at least, not sure which DBMS while answering, answering based on my experience with SQL Server:

1.) GUID as a Primary Key is generally not a great idea. Not in SQL Server especially. What's wrong with an Integer primary key? Yes whatever your Primary Key is becomes your Foreign Key in the child table(s) so that is large and probably not necessary.

2.) I would use a regular datetime column. Performance wise, if you are well indexed, you shouldn't notice a difference here. And a datetime column is more versatile as, well, a date. You can ask questions of it that you can't ask as easily of an INT column with built in datetime functions. If you don't need time and you are on a DBMS or version that has just the DATE you can use that data type.

3.) Yes. Especially if CustomerID is a foreign key to a Customers table. Good to index. Whether or not you need to make an index on CustomerID AND the Date column depends on how the queries will typically look. If you are often querying joining to customer and specifying a date range you may find it beneficial to have the date. You may find it beneficial to include some other columns to cover other queries as parts of the key or included columns. It really depends on your queries and data, though.

As far as clustering on the date column. That's a hard one. If this were a fact table in a warehouse and every single query was always on a date range, then there are some benefits there. If this is an operational invoice table, I imagine your app also joining into invoices in other approaches also. I also imagine invoices being queried by invoice IDs stored in other tables, etc. So I don't believe there is enough to determine clustered key. I'm of the school that prefers a simple surrogate key for OLTP tables. An InvoiceID INT (or BIGINT if you really would blow out an INT) setup as an identity column so it is always increasing and avoiding page splits. But I don't know if there is a definite wrong answer here (well there are many, but you haven't proposed any of those)

share|improve this answer
    
Thank you. Personally I prefer a INT identity column as PK as well. However I am thinking there are situations when customer want to import old invoices or "secretly insert a yesterday's invoice". An int Identity with cluster will have strange "create date" order.. – Yiping 4 hours ago
    
Not sure that the create date order matters or is a concern here? Guess it depends on what you mean by "create" – Mike Walsh 4 hours ago
1  
@Yiping Then don't use int Identity for create date order. You think a PK of GUID is going to fix that? – Paparazzi 3 hours ago
    
Actually my only concern is whether the rows should be stored based on time (as they are always fetched like that) or the Identity column (sequential number looks better). – Yiping 3 hours ago
    
So the date should be, a date, not an INT. and the identity could be that date but I don't like that for the reasons above and would suggest just using an identity column or perhaps a natural business key. but not an int representation of date. You don't want to try and sort an INT represented date like you would a date. Why cause that struggle. If it were me? I'd probably do what my answer says. InvoiceID INT Identity PK... Date as an actual date or datetime. – Mike Walsh 3 hours ago

Not the answer you're looking for? Browse other questions tagged or ask your own question.