There is a lot of confusing terminology in this question. You mix up discrete time and continuous time terminology. I will do my best to clear things up and provide an answer. First of all, the Dirac delta function is a generalized function defined for a continuous variable (typically frequency or time for our purposes). To say that the DFT (which maps discrete sequences in time to discrete sequences in frequency) yields the Dirac delta function makes no sense. It is possible that the DFT of a sequence can lead to the Kronecker delta function (or simply unit impulse function). One may think of the Kronecker delta as serving the same purpose in discrete time that the Dirac delta function serves in continuous time (however, the Kronecker delta is much simpler to understand).
Also, I would be careful about stating that the spectrum resulting from zero padding is a sinc function. I will use the term sinc-like, because it is not exactly a sinc function (this can be proved analytically). It is something quite close to a sinc function, but periodic with period $2\pi$ in normalized frequency, as the spectrum of any discrete-time sequence must be.
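For reference, here is a short sketch of that analytical argument. The notation is my own assumption, not something from the question: $w[n]$ is a rectangular window equal to $1$ for $0 \le n \le N-1$ and $0$ elsewhere, $W(e^{j\omega})$ is its spectrum, and $\omega$ is normalized frequency in radians per sample. Summing the geometric series gives

$$ W(e^{j\omega}) = \sum_{n=0}^{N-1} e^{-j \omega n} = \frac{1 - e^{-j \omega N}}{1 - e^{-j \omega}} = e^{-j \omega (N-1)/2} \, \frac{\sin(\omega N / 2)}{\sin(\omega / 2)}. $$

The ratio of sines is periodic in $\omega$ and behaves like $\frac{\sin(\omega N/2)}{\omega/2}$ (a true sinc shape) only for $|\omega| \ll 1$, which is why I call it sinc-like. Multiplying the window by a complex exponential simply shifts this shape so that it is centered on the exponential's frequency.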
Okay, now that we have that out of the way, I will try to make some sense of what is being asked and address the main points of the question.
Let's consider the first case. We have finite values of $e^{j \omega_0 t}$. I think when you say finite values you really mean that we are considering the sampled version of the signal. Even time-limited versions of $e^{j \omega_0 t}$ have an infinite number of values. On the other hand, a time-limited and sampled version has a finite number of values. This seems to be what you are referring to.
> In the first case the result is the Dirac delta function...
Even if we replace the word Dirac with Kronecker, this assertion is (in general) false. To see this, let's consider the samples of a complex exponential. The continuous time function is given as
$$ x(t) = e^{j \omega_0 t}, $$
and the sampled version with sampling period $T$ is given by
$$ x[n] = e^{j \omega_0 n T}. $$
Now, the shape of an $N$-point DFT is different for different values of $\omega_0 T$. If $\omega_0 T$ is equal to $\frac{2 \pi k}{N}$, where $k$ is an arbitrary integer, then the DFT results in the Kronecker delta. If, on the other hand, $\omega_0 T = \frac{2 \pi (k + \epsilon)}{N}$ where $0 < \epsilon < 1$ (i.e., the frequency falls between two bins), you will see a sinc-like result in the DFT output. This fact contradicts your first claim.
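If you want to see the two cases numerically, here is a minimal NumPy sketch. The values `N = 16` and `k = 3`, the helper name `dft_magnitude`, and the choice of a frequency exactly halfway between two bins are all my own assumptions for illustration.

```python
import numpy as np

N = 16   # DFT length (assumed for illustration)
k = 3    # bin index (assumed for illustration)
n = np.arange(N)

def dft_magnitude(omega0_T):
    """Magnitude of the N-point DFT of x[n] = exp(j * omega0_T * n)."""
    x = np.exp(1j * omega0_T * n)
    return np.abs(np.fft.fft(x))

# Frequency exactly on bin k: an integer number of periods in the window.
on_bin = dft_magnitude(2 * np.pi * k / N)

# Frequency halfway between bins k and k + 1: no integer number of periods.
off_bin = dft_magnitude(2 * np.pi * (k + 0.5) / N)

print(np.round(on_bin, 3))   # a single non-zero value at index k
print(np.round(off_bin, 3))  # every bin is non-zero (sinc-like leakage)
```

The first array has a single entry of height `N` at index `k`, while the second has energy in every bin, largest in the two bins that straddle the true frequency.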
The reason for these different outputs is that both results are sampled versions of the same (continuous-valued) spectrum. If you don't understand the difference between the Continuous Time Fourier Transform (CTFT), the Discrete Time Fourier Transform (DTFT) and the Discrete Fourier Transform (DFT), now would be a good time to read about it. The very short version is that the DTFT yields the (continuous-valued) spectrum of a sequence (i.e., a sampled signal). The DFT is a sampled version of the DTFT over a finite set of samples (i.e., a time-domain window). On the other hand, the CTFT deals with continuous time signals (in general existing for all time). There is a lot more to all of this, and I recommend you read more. For now, that's the gist of what we need.
In both of the cases above, the DTFT of the windowed complex exponential is a sinc-like response centered at $\omega_0 T$ (in normalized frequency). The two DFT outputs look different only because of where the DTFT is sampled to obtain the corresponding DFT. When $\omega_0 T = \frac{2 \pi k}{N}$, an integer number of periods of the complex exponential fits into one DFT window. The DFT output then consists of samples of the sinc-like function at its peak and at its zero crossings, so you have one non-zero output and the rest are zeros (i.e., the Kronecker delta). When you do not have an integer number of periods in one DFT window, the samples miss both the peak and the zero crossings, and so you see more of the shape of the sinc-like response in the nonzero samples.
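To check the "DFT is a sampled DTFT" statement for yourself, here is a small sketch that evaluates the DTFT of a finite sequence by direct summation and compares it with `np.fft.fft`. The helper `dtft`, the length `N = 16`, and the two test frequencies (3 and 3.5 bins) are hypothetical choices of mine.

```python
import numpy as np

N = 16               # window length (assumed for illustration)
n = np.arange(N)

def dtft(x, omega):
    """DTFT of the finite sequence x at the frequencies omega (rad/sample)."""
    omega = np.atleast_1d(omega)
    # One row per frequency: sum over n of x[n] * exp(-j * omega * n).
    return np.exp(-1j * np.outer(omega, np.arange(len(x)))) @ x

# Frequencies at which an N-point DFT samples the DTFT.
bin_freqs = 2 * np.pi * np.arange(N) / N

for bins in (3.0, 3.5):   # on-bin and off-bin examples (assumed)
    omega0_T = 2 * np.pi * bins / N
    x = np.exp(1j * omega0_T * n)
    matches = np.allclose(dtft(x, bin_freqs), np.fft.fft(x))
    peak = abs(dtft(x, omega0_T)[0])   # height of the DTFT at omega_0 T
    print(bins, matches, round(peak, 6))
```

In both cases the DFT agrees with the DTFT evaluated at the bin frequencies, and the underlying DTFT has the same peak height `N` at $\omega_0 T$. Only the sampling positions relative to that peak differ.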
> ...while in the second case the result is a sinc function...
In this second case, we have zero padded the signal. Recall that the original DFT resulted in the Kronecker delta at the bin centered at $\omega_0 T$ because we were sampling the sinc-like function at its peak and all of its zero crossings. Those $N$ samples already account for every zero crossing in one period, so as we add more frequency bins to our calculation, the resulting DFT can be viewed as a finer sampling of the same DTFT spectrum (with the same sampling period and time-limited window). The additional samples fall between the zero crossings, so they are non-zero, and the result resembles the sinc-like spectrum that you have observed.
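The effect of zero padding can be sketched in a few lines of NumPy. The values `N = 16`, `k = 3` and the padding factor `pad = 8` are assumptions of mine. The `n` argument of `np.fft.fft` pads the input with trailing zeros when it exceeds the input length.

```python
import numpy as np

N = 16     # original window length (assumed for illustration)
pad = 8    # zero-padding factor (assumed for illustration)
k = 3      # the exponential sits exactly on bin k of the N-point DFT

x = np.exp(1j * 2 * np.pi * k / N * np.arange(N))

X = np.fft.fft(x)                      # N-point DFT: Kronecker delta at bin k
X_padded = np.fft.fft(x, n=pad * N)    # same samples, zero padded to pad * N

# Every pad-th sample of the padded DFT lands on an original bin frequency.
same_on_grid = np.allclose(X_padded[::pad], X)

# Count samples that are non-zero (up to floating-point round-off).
nonzero_original = np.count_nonzero(np.abs(X) > 1e-9)
nonzero_padded = np.count_nonzero(np.abs(X_padded) > 1e-9)

print(same_on_grid, nonzero_original, nonzero_padded)
```

The padded DFT agrees with the original one on the original bin frequencies, so nothing about the underlying spectrum has changed. The in-between samples are simply new looks at the same sinc-like DTFT.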
> ...where we know that zero padding only enhances the estimate of amplitude.
I think you are missing something here, because this seems to have no bearing on the rest of the question. Perhaps you can clarify your intent?