Truly unique file name.

Mufasa

What's an easy way to create a truly unique file name in a specified
directory? I could use Path.GetTempFileName but I want to be able to specify
the directory to create the file in.

TIA - Jeff.
 
Jeff,

You can call the GetTempFileName function through the P/Invoke layer.
It allows you to specify the path for the filename to be created in. Note
that this will create the actual file (just as the call to
Path.GetTempFileName does).
 
I've never done P/Invoke. How does that work? Do you have a link I can look
at?

TIA - Jeff.

 
Jeff,

Here is the link to the declaration you will use:

http://www.pinvoke.net/default.aspx/kernel32/GetTempFileName.html

Note that you will have to set the capacity of the StringBuilder that
you pass in before you make the call. I believe that for GetTempFileName,
you need space for 14 characters.

For a general tutorial on P/Invoke, I'd start here:

http://msdn.microsoft.com/en-us/library/aa288468.aspx
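
For anyone following along, here is a minimal sketch of what that call could look like from C#. The class and method names (NativeTempFile, Create) are made up for illustration, and the declaration is written from the Win32 signature rather than copied from the linked page, so treat it as an assumption. The Win32 documentation asks for an output buffer of MAX_PATH (260) characters, so the sketch sizes the StringBuilder that way rather than using 14.

```csharp
using System;
using System.ComponentModel;
using System.IO;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical helper, Windows only: wraps the Win32 GetTempFileName function.
static class NativeTempFile
{
    private const int MaxPath = 260; // MAX_PATH

    [DllImport("kernel32.dll", CharSet = CharSet.Auto, SetLastError = true)]
    private static extern uint GetTempFileName(
        string lpPathName,
        string lpPrefixString,
        uint uUnique,
        StringBuilder lpTempFileName);

    // Creates an empty, uniquely named file in 'directory' and returns its full path.
    public static string Create(string directory, string prefix)
    {
        var buffer = new StringBuilder(MaxPath);

        // uUnique = 0 asks Windows to choose the number and create the file.
        uint result = GetTempFileName(directory, prefix, 0, buffer);
        if (result == 0)
        {
            // A return value of 0 signals failure; surface the Win32 error code.
            throw new Win32Exception(Marshal.GetLastWin32Error());
        }

        return buffer.ToString();
    }
}
```

Usage would be something like NativeTempFile.Create(@"C:\SomeDir", "abc"). Only the first three characters of the prefix are used by Windows, and the file exists on disk when the call returns, so remember to delete it when you are done.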


--
- Nicholas Paldino [.NET/C# MVP]
- (e-mail address removed)

 
Unfortunately, that's not going to work, as it's very possible that the
file already exists. Calling GetTempFileName (on the path object, or
through the API) will create the file, guaranteeing that there will not be a
conflict with a name that already exists.
 
Nicholas Paldino [.NET/C# MVP] wrote:
[on generating a temporary file name with a GUID]
Unfortunately, that's not going to work, as it's very possible that the
file already exists.

The chance the file already exists is equal to the chance that the same GUID
has already been generated, and the whole point of GUIDs is that this
probability is so small that it doesn't matter. So no, it's not very
possible, it's extremely unlikely. While early GUID generation algorithms
were limited in the number of unique GUIDs they could generate per second,
this is no longer a problem (and the number of files you generate per second
should be far more limited anyway).
Calling GetTempFileName (on the path object, or through the API) will
create the file, guaranteeing that there will not be a conflict with a
name that already exists.
GetTempFileName() is considerably *worse* in this regard than using a GUID,
as it can only generate 65,535 unique files for any given prefix (and its
performance drops the more there are). The MSDN itself actually recommends
switching to GUIDs for this case (http://msdn.microsoft.com/library/aa364991).

Note that neither of these methods is acceptable if the file creation has to
be robust in the face of attackers: someone who knows you're using GUIDs or
GetTempFileName() (or any other systematic way of generating file names)
could predict your next name and either pre-create or repossess a file. See
for example http://dotnet.org.za/markn/archive/2006/04/15/51594.aspx.

Secure applications aside, though, for normal uses a GUID is fine.
 
Jeroen,

Unfortunately, you are making the assumption that the problem exists
with the set of values that one is selecting from for the name of the file.
This isn't where the problem lies.

The problem is that the statement that Peter suggested:

string unique = Path.Combine(yourPath, Guid.NewGuid().ToString());

Doesn't take into account the files that ^already exist in the
directory^. Because of this, it is perfectly reasonable to assume that it
could be the case that a file with the same GUID already exists, because the
OP didn't indicate that no other process but his would be writing to the
directory (which isn't going to be the case, because if the user can write
to it, any process run by that user can write to it) and that if there were,
that they would abide by the same convention for generating names of files
in the directory (by using new GUIDs).

The difference between that and GetTempFileName is that GetTempFileName
will ^guarantee^ a unique file name in the directory upon a successful call
(and it does so by having the file be created upon return of the call).

Granted, one can work around this by a call to CreateFile to create the
file (you could do it with .NET if you want to catch the exceptions instead
of checking return codes through API methods), but unless you have a
specific need for more than 65,535 unique temp file names, why write
something that has already been done for you?
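
A minimal sketch of that workaround in plain .NET, assuming a GUID-based name and a ".tmp" extension. UniqueFile, Create and maxAttempts are hypothetical names, not anything from the framework. The idea is to let the file system decide: FileMode.CreateNew fails if the file already exists, so a successful open reserves the name, and a failure just means try again.

```csharp
using System;
using System.IO;

// Hypothetical helper: create-and-retry, similar in spirit to GetTempFileName.
static class UniqueFile
{
    // Creates an empty file with a GUID-based name in 'directory' and returns its path.
    public static string Create(string directory, int maxAttempts = 10)
    {
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            string path = Path.Combine(directory, Guid.NewGuid().ToString() + ".tmp");
            try
            {
                // CreateNew throws IOException if the file already exists,
                // so getting past this line means the name is now ours.
                using (new FileStream(path, FileMode.CreateNew, FileAccess.Write, FileShare.None))
                {
                    return path;
                }
            }
            catch (IOException) when (File.Exists(path))
            {
                // The name was taken; loop around and generate another one.
            }
        }

        throw new IOException("Could not create a unique file in " + directory);
    }
}
```

Other IOExceptions (missing directory, full disk and so on) are deliberately not swallowed, since retrying with a different name would not help there.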

--
- Nicholas Paldino [.NET/C# MVP]
- (e-mail address removed)

 
Nicholas said:
Unfortunately, you are making the assumption that the problem exists
with the set of values that one is selecting from for the name of the file.
This isn't where the problem lies.

The problem is that the statement that Peter suggested:

string unique = Path.Combine(yourPath, Guid.NewGuid().ToString());

Doesn't take into account the files that ^already exist in the
directory^.

Because it doesn't have to.
Because of this, it is perfectly reasonable to assume that it
could be the case that a file with the same GUID already exists, because the
OP didn't indicate that no other process but his would be writing to the
directory (which isn't going to be the case, because if the user can write
to it, any process run by that user can write to it) and that if there were,
that they would abide by the same convention for generating names of files
in the directory (by using new GUIDs).
And if they did, it still wouldn't matter one bit, unless you think
processes are likely to generate identical GUIDs. They're not. This is what
being unique in space and time is all about. It's not magic, it's just a lot
of bits.

It makes no difference if one process is generating thousands of files, or
thousands of processes are generating thousands of files. The probability of
a clash is still negligible. And I mean NEGLIGIBLE negligible, not "pretty
unlikely". GUIDs generated on one machine are unique by use of a value based
on a time stamp and a uniquifier. By the time your directory is full enough
to make a GUID clash remotely likely, your file system has long since given
up. It's literally not something you worry about.
The difference between that and GetTempFileName is that GetTempFileName
will ^guarantee^ a unique file name in the directory upon a successful call
(and it does so by having the file be created upon return of the call).
The difference is this: using a GUID guarantees a unique file name by virtue
of GUIDs being unique. Using GetTempFileName() guarantees a unique file name
by virtue of repeatedly generating a file name and creating a file by that
name until it succeeds (hopefully on the first try). Both "work" in this
respect.
Granted, one can work around this by a call to CreateFile to create the
file

The point is that you don't need to, unless you are concerned about people
deliberately producing files with clashing names. And if that *is* your
concern, GetTempFileName() is of no use either, as I've explained upthread.

Note also that GetTempFileName() is restricted to 65,535 file names *per
user*. There's only one temp directory. Once it's "full" by these standards,
GetTempFileName() will simply fail. Now, I've never seen that happen on a
production system, but if you want to argue that existing files are a
problem, GetTempFileName() would become a problem long before GUID file
names would. In other words: neither has a problem.
 
Jeroen said:
Nicholas Paldino [.NET/C# MVP] wrote:
[about using GUIDs for file names not being safe]
The point is that you don't need to, unless you are concerned about
people deliberately producing files with clashing names. And if that
*is* your concern, GetTempFileName() is of no use either, as I've
explained upthread.
Whoops, snipped the part about catching exceptions, which is necessary to
understand the rest. Sorry about that.
 
See inline.

Jeroen Mostert said:
Because it doesn't have to.

Actually, it does, because he is trying to get the same functionality as
GetTempFileName. GetTempFileName will create a unique file in the directory
(up to the limit of 65,535 for the directory it is created in). The call
above doesn't create the file.

I'm not contesting the highly unlikely possibility that a new GUID would
conflict with a filename. The point is, it most definitely can happen when
a 128 bit value is generated by some method other than the one prescribed
for creating GUIDs.

It's not the case where what is in the directory is constrained by a new
GUID, it's when it is not constrained which is the problem. The OP didn't
state anything about what would be in the directory, so I didn't assume it.
I agree that if all you used for the contents of the directory is a new
GUID, then that would be fine (not perfect, but fine).
And if they did, it still wouldn't matter one bit, unless you think
processes are likely to generate identical GUIDs. They're not. This is
what being unique in space and time is all about. It's not magic, it's
just a lot of bits.

No, processes are not likely to generate identical GUIDs, but they can
definitely produce identical 128-bit values, which a GUID happens to be.
It's the method of creation of the GUID that helps to guarantee the
uniqueness of the value, not just the size of the data.

If you don't follow that method though for creating the 128 bit value,
then the probability of a unique value diminishes. There is nothing
stopping a process from using a 128 bit counter starting at 0 (or some other
unsigned 128 bit value) and incrementing up or down. A collision could
easily happen in that manner.

Again though, if there is a constraint that the names of the files that
are in the directory are limited to what is produced by creating new GUIDs
then it would most definitely be fine to use that.
It makes no difference if one process is generating thousands of files, or
thousands of processes are generating thousands of files. The probability
of a clash is still negligible. And I mean NEGLIGIBLE negligible, not
"pretty unlikely". GUIDs generated on one machine are unique by use of a
value based on a time stamp and a uniquifier. By the time your directory
is full enough to make a GUID clash remotely likely, your file system has
long since given up. It's literally not something you worry about.

Well, as stated above, it is if you don't have that constraint, which is
the point I was trying to make all along. There are many methods for
generating 128-bit values. If everyone doesn't play by the same rules, this
can lead to a conflict.
The difference is this: using a GUID guarantees a unique file name by
virtue of GUIDs being unique. Using GetTempFileName() guarantees a unique
file name by virtue of repeatedly generating a file name and creating a
file by that name until it succeeds (hopefully on the first try). Both
"work" in this respect.

Well, they do, but only if you code it that way. If one just assumes
that a GUID will always produce a unique value when compared to other
methods of generating 128-bit values, then it's going to fail at some point.
So to counter that, you create the unique file name over and over until it
succeeds. Which is exactly what GetTempFileName does. It's at that point
that I ask "why reinvent the wheel if that is not required". If the need is
to generate more than 65,535 temp file names without cycling prefixes (I am
referring to the Win32 API function, not the .NET static method), then yes,
another method is needed.
The point is that you don't need to, unless you are concerned about people
deliberately producing files with clashing names. And if that *is* your
concern, GetTempFileName() is of no use either, as I've explained
upthread.

Note also that GetTempFileName() is restricted to 65,535 file names *per
user*. There's only one temp directory. Once it's "full" by these
standards, GetTempFileName() will simply fail. Now, I've never seen that
happen on a production system, but if you want to argue that existing
files are a problem, GetTempFileName() would become a problem long before
GUID file names would. In other words: neither has a problem.

My original response recommended to use the Win32 API version of
GetTempFileName, not the .NET version, so the limit is not per user, but per
directory, assuming a fixed prefix.

In the end, without checking, using a new GUID is not guaranteed to work
upon a successful call (without trying to create the temp file to guarantee
the name, and cycling if it exists), although the probability is
astronomically high that it will, versus a successful call to
GetTempFileName, where the result is always guaranteed. For EITHER method,
trying to create the file and cycling to another name is required in order
to guarantee it would work all the time (when the call returns success, that
is).
 
Nicholas said:
See inline.
I always do. :-)
Actually, it does, because he is trying to get the same functionality
as GetTempFileName. GetTempFileName will create a unique file in the
directory (up to the limit of 65,535 for the directory it is created
in). The call above doesn't create the file.
Well, imagine the call to do that to be inserted and I'm sure it doesn't
change the arguments. I was not under the impression the GUID generation
somehow magically reserved the file name, as my responses hopefully made clear.
I'm not contesting the highly unlikely possibility that a new GUID
would conflict with a filename. The point is, it most definitely
can happen when a 128 bit value is generated by some method other than
the one prescribed for creating GUIDs.
Oh come on. Now you want to discuss the scenario where a totally unrelated
application is using 128-bit values formatted as hexstrings of the form
"xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" that are not GUIDs, in the same
directory as yours?! I get the strong feeling you're moving the goalposts to
wherever you're right, regardless of how far we have to move them.

But fair is fair, I'm going to concede the point here that you just *might*
have a problem in a scenario like this. I'm even going to pretend I can see
this happening without malicious intent. That said, I'm pretty sure I can
come up with some equally improbable set of circumstances where
GetTempFileName() doesn't help you. How about this one: some idiotic
application generates 65,535 file names of exactly the form
GetTempFileName() produces before you come in. Damn the rotten luck! This
highly specific but far from impossible scenario reveals a critical flaw!
My original response recommended to use the Win32 API version of
GetTempFileName, not the .NET version, so the limit is not per user, but
per directory, assuming a fixed prefix.
Yes, this is a good point. I was thinking only of Path.GetTempFileName()
because P/Invoking is not the first tool in my box, but obviously nothing is
stopping you from using that. Well, almost nothing -- we'll assume you're
not hosting it in a high security environment or porting the code to Mono or
something nit-pickingly specific like that... I'm sorry, I'll stop now. I
have to go to bed anyway. But if anyone asks, you started it!
In the end, without checking, using a new GUID is not guaranteed to
work upon a successful call (without trying to create the temp file to
guarantee the name, and cycling if it exists), although the probability
is astronomically high that it will, versus a successful call to
GetTempFileName, where the result is always guaranteed.

The operative word here being "successful". We're both conceding that either
side may fail, now we're just quibbling over the relative possibility and
preferred mode of failure.

Yes, the GUID method with no extra checking only works if you're sure to the
extreme the generated values cannot clash -- my point is just that that's
the majority of cases. The "GUID with cycling" approach is a straw construct
that nobody should consider -- if GUIDs can't help you get uniqueness
without checking, the good Lord couldn't help you.

I think we've just about argued each other to a standstill here, from which
we can both walk away without losing too much face. So let's by all means do
that. :-)
 
See inline.

Jeroen Mostert said:
Nicholas said:
See inline.
I always do. :-)
Jeroen Mostert said:
Nicholas Paldino [.NET/C# MVP] wrote:
Unfortunately, you are making the assumption that the problem
exists with the set of values that one is selecting from for the name
of the file. This isn't where the problem lies.

The problem is that the statement that Peter suggested:

string unique = Path.Combine(yourPath, Guid.NewGuid().ToString());

Doesn't take into account the files that ^already exist in the
directory^.

Because it doesn't have to.

Actually, it does, because he is trying to get the same functionality
as GetTempFileName. GetTempFileName will create a unique file in the
directory (up to the limit of 65,535 for the directory it is created in).
The call above doesn't create the file.
Well, imagine the call to do that to be inserted and I'm sure it doesn't
change the arguments. I was not under the impression the GUID generation
somehow magically reserved the file name, as my responses hopefully made
clear.

But that's the point I am trying to make. GetTempFileName (both
versions, .NET and Win32 API) will create the file and guarantee that it is
reserved for you, the GUID method does not, unless you cycle to try and
create the file with that name. This is an important point that the OP
should be aware of, at least.
Oh come on. Now you want to discuss the scenario where a totally unrelated
application is using 128-bit values formatted as hexstrings of the form
"xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" that are not GUIDs, in the same
directory as yours?! I get the strong feeling you're moving the goalposts
to wherever you're right, regardless of how far we have to move them.

But fair is fair, I'm going to concede the point here that you just
*might* have a problem in a scenario like this. I'm even going to pretend
I can see this happening without malicious intent. That said, I'm pretty
sure I can come up with some equally improbable set of circumstances where
GetTempFileName() doesn't help you. How about this one: some idiotic
application generates 65,535 file names of exactly the form
GetTempFileName() produces before you come in. Damn the rotten luck! This
highly specific but far from impossible scenario reveals a critical flaw!

Actually, the difference between the GUID method and the call to
GetTempFileName is that GetTempFileName will fail (return an error code in
Win32 or throw an Exception in .NET) in the event that there are that many
temp files and it can not guarantee the generation of a file with a unique
name.

In the end, it doesn't matter what size the name of the file is, you
can't guarantee that the file is unique in the directory unless you try and
create the file. This will require a call to CreateFile (or a .NET API
which will do that) and a regeneration of the file name if the file already
exists. Just expanding the size of the unique name only decreases your odds
of a conflict, but it doesn't guarantee that the file will be unique.
Yes, this is a good point. I was thinking only of Path.GetTempFileName()
because P/Invoking is not the first tool in my box, but obviously nothing
is stopping you from using that. Well, almost nothing -- we'll assume
you're not hosting it in a high security environment or porting the code
to Mono or something nit-pickingly specific like that... I'm sorry, I'll
stop now. I have to go to bed anyway. But if anyone asks, you started it!
Started?


The operative word here being "successful". We're both conceding that
either side may fail, now we're just quibbling over the relative
possibility and preferred mode of failure.

Yes, the GUID method with no extra checking only works if you're sure to
the extreme the generated values cannot clash -- my point is just that
that's the majority of cases. The "GUID with cycling" approach is a straw
construct that nobody should consider -- if GUIDs can't help you get
uniqueness without checking, the good Lord couldn't help you.

I don't believe that to be true based on my point above. There is a
difference between guarantee and almost guaranteed, no matter how small the
probability is.
I think we've just about argued each other to a standstill here, from
which we can both walk away without losing too much face. So let's by all
means do that. :-)

I don't agree there is a standstill here. Without creating a file, you
have no chance of making a guarantee that a file is unique no matter how
large the filename may be.

I just don't want to see the OP have a method that simply returns the name
of the file and then have an exception creep up in his code one day because
of something that could have been prevented now. If something can be locked
down before code is deployed, then it should be. This is one of those
cases.

With all due respect, this isn't about losing face. It's simply a
debate on a topic between two people. There is no face to be won or lost
here, at least from my perspective.
 
Peter Duniho said:
I don't think that Nicholas's concerns are unreasonable. After all, two
instances of the same application is not a terribly unusual scenario, and
could easily lead to the above conditions.

No it couldn't. The key part is "that are *not* GUIDs" (emphasis
added). If they are the same application, then both will be using the
same GUID generator, and there's no conflict.

For there to be a conflict, you have to have one process that creates
files in the given directory using GUIDs and another process that also
creates files in the same directory, using the same format as the
GUIDs, but not using GUIDs.

Even so, unless you're using sequential GUIDs, a conflict is unlikely
(if you are using sequential GUIDs, it's not that unlikely at all, as
the second process can be simply taking an existing file name and
modifying it to create the new filename).
 
[...]
I don't think that Nicholas's concerns are unreasonable. After all, two
instances of the same application is not a terribly unusual scenario, and
could easily lead to the above conditions.
No it couldn't.  The key part is "that are *not* GUIDs" (emphasis
added).  If they are the same application, then both will be using the
same GUID generator, and there's no conflict.

I encourage you to post a link to some documentation that guarantees that 
when using the Guid class, the GUIDs returned are guaranteed to be unique.

Hint: the .NET documentation specifically _denies_ this guarantee.
That's overblown. What the documentation says, literally, in one
sentence, is "such an identifier has a very low probability of being
duplicated." And this is true. A very, very low probability of being
duplicated. A preposterously low probability of being duplicated. A
"you'll win the lottery five times in a row" order of probability.

Of course there's no absolute guarantee they're unique, that's
impossible! There's only 128 bits, far short of the infinity required
for guaranteed uniqueness. What people don't seem to appreciate is
that you could generate millions of GUIDs per minute and still not
have any significant chance of collision.
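
To put a rough number on that, here is a back-of-the-envelope sketch using the birthday bound. It assumes purely random (version 4) GUIDs with 122 random bits, which is an assumption on my part and not something established in this thread. GuidOdds and CollisionProbability are hypothetical names.

```csharp
using System;

// Hypothetical helper: estimates the odds of any two GUIDs colliding.
static class GuidOdds
{
    // Birthday-bound approximation: p is about n(n-1)/2 divided by 2^bits.
    // Only meaningful while the result is much smaller than 1.
    public static double CollisionProbability(double count, int randomBits = 122)
    {
        double pairs = count * (count - 1) / 2.0;
        return pairs / Math.Pow(2.0, randomBits);
    }
}
```

Under that assumption, GuidOdds.CollisionProbability(1e9) comes out at roughly 10^-19 for a billion names in the same directory.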

A GUID is *not* a random number, it's bound to time and space. This is
exactly where the uniqueness comes from.
No.  You can simply have two processes using the exact same algorithm at  
the exact same time.
I should *hope* they're using the same algorithm; any decent algorithm
would be designed to minimize collisions with instances of itself.
It is a mistake to assume that GUIDs are unique.  In spite of the name, 
there is not necessarily a guarantee that they are.
It is a mistake to assume that GUIDs are not unique when they are
properly used (that is, generated according to the documented
algorithms, not made up by the programmer like they often are in COM).
They were specifically designed to provide uniqueness. Arguing that
GUIDs fail at their primary purpose is quite a claim.

Programmers seem to have an unusually hard time coming to grips with
anything that's not binary certainty (i.e. absolute). For any
practical purpose you'd care to name (including temp filenames) an
infinitesimally small chance is as good as zero. Not as in the same --
just as good.
 
Jeroen Mostert said:
Nicholas said:
Nicholas Paldino [.NET/C# MVP] wrote:
    Unfortunately, you are making the assumption that the problem
exists with the set of values that one is selecting from for the name
of the file. This isn't where the problem lies.
    The problem is that the statement that Peter suggested:
string unique = Path.Combine(yourPath, Guid.NewGuid().ToString());
    Doesn't take into account the files that ^already exist in the
directory^.
Because it doesn't have to.
   Actually, it does, because he is trying to get the same functionality
as GetTempFileName.  GetTempFileName will create a unique file in the
directory (up to the limit of 65,535 for the directory it is created in).
The call above doesn't create the file.
Well, imagine the call to do that to be inserted and I'm sure it doesn't
change the arguments. I was not under the impression the GUID generation
somehow magically reserved the file name, as my responses hopefully made
clear.

    But that's the point I am trying to make.  GetTempFileName (both
versions, .NET and Win32 API) will create the file and guarantee that it is
reserved for you, the GUID method does not, unless you cycle to try and
create the file with that name.  This is an important point that the OP
should be aware of, at least.
I've agreed that it's a point. I still don't agree it's an *important*
point, for reasons detailed below.
    In the end, it doesn't matter what size the name of the file is, you
can't guarantee that the file is unique in the directory unless you try and
create the file.  This will require a call to CreateFile (or a .NET API
which will do that) and a regeneration of the file name if the file already
exists.  Just expanding the size of the unique name only decreases your odds
of a conflict, but it doesn't guarantee that the file will be unique.
Yes. In turn, the point on my end that doesn't seem to be appreciated
is that "the odds" we're talking about are not just decreased, they're
reduced to something you could stake your life on. Well, almost --
we're still talking computer software, of course, and there's no end
to programmer deviousness. Your use of the word "only" is misplaced,
and the length of the name is not the argument -- the algorithm
generating it is.
    I don't believe that to be true based on my point above.  There is a
difference between guarantee and almost guaranteed, no matter how small the
probability is.
Well, here's where we part ways, because I believe there's no
difference that matters. I therefore disagree with your "no matter how
small" assessment. In absolute terms you're right -- there is,
obviously, a quantifiable difference, but not one that matters. We're
talking real-world execution here, not the world where programs run
literally forever, memory is infinite, all statements on program
execution must be definitive, etc.

Many programmers are uncomfortable with probability, I'm one of them,
but in this case I think I've got a good view of the odds. This is not
one of those "well I *think* this will never fail, so why worry about
it" things that are the precursors to so many bugs.
    I don't agree there is a standstill here.  Without creating a file, you
have no chance of making a guarantee that a file is unique no matter how
large the filename may be.
Yes, this is the point you have been pressing. My point in turn is
that you can favor a better chance of the operation not failing in the
first place over favoring this guarantee to be absolute. I'm not
saying I do, by the way -- for practical purposes I think these
approaches would be equal.
  I just don't want to see the OP have a method that simply returns the name
of the file and then have an exception creep up in his code one day because
of something that could have been prevented now.  If something can be locked
down before code is deployed, then it should be.  This is one of those
cases.
This "locking down" argument is spurious. Creation of the file can
*always* fail, for reasons that have nothing to do with the file
already existing. You cannot leave out error handling regardless of
what method you use, and no method will guarantee a way for you to
always open a temp file without problems. The exact odds of failure of
GUIDs versus GetTempFileName() vary, but it's not true that
GetTempFileName()'s are strictly no worse.
    With all due respect, this isn't about losing face.  It's simply a
debate on a topic between two people.  There is no face to be won or lost
here, at least from my perspective.
Well, that's just like, your opinion, man. I happen to derive my
entire self-worth from having arguments on the Internet!

I'm kidding -- I think.
 
[...]
   I'm not contesting the highly unlikely possibility that a new GUID  
would conflict with a filename.  The point is, it most definitely  
can happen when a 128 bit value is generated by some method other than 
the one prescribed for creating GUIDs.
Oh come on. Now you want to discuss the scenario where a totally  
unrelated application is using 128-bit values formatted as hexstrings of  
the form "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" that are not GUIDs, in  
the same directory as yours?! I get the strong feeling you're moving the  
goalposts to wherever you're right, regardless of how far we have to  
move them.

I don't think that Nicholas's concerns are unreasonable.  After all, two  
instances of the same application is not a terribly unusual scenario, and 
could easily lead to the above conditions.
The applications being the same is actually the best case, for reasons
you yourself mention: sharing the GUID generation algorithm gives a
better chance of avoiding collisions.
Any solution that is intended to _reliably_ return the name of a file (or 
even better, a reference to a FileStream for the opened file) that is  
guaranteed to be unique and dedicated to the caller of the solution is  
going to have to actually create the file and make sure it was actually  
newly created.  Using a GUID doesn't change that.
This is true, and people apparently think this is the point I'm not
willing to concede.
[...]
   In the end, without checking, using a new GUID is not guaranteed to  
work upon a successful call (without trying to create the temp file to 
guarantee the name, and cycling if it exists), although the probability  
is astronomically high that it will, versus a successful call to  
GetTempFileName, where the result is always guaranteed.
The operative word here being "successful". We're both conceding that  
either side may fail, now we're just quibbling over the relative  
possibility and preferred mode of failure.

I don't think so.  Nicholas is saying that the method will have to check  
to make sure there was no collision and retry the operation until there  
wasn't one.  You seem to be suggesting the method should not bother to  
check at all, and in fact one could just generate and use a GUID directly,  
rather than wrapping it in a method with functionality similar to  
GetTempFileName().
No, no, of course not! Creating a file can always fail, and you can
never afford not to check (unless you don't mind your program
terminating immediately, which could be the case). What I *do* argue
is that if it *does* fail, it won't be the GUID's fault.
That's simply not true.  Unless you have all processes generating GUIDs 
 from the same source, a source that itself guarantees no collisions,  
uniqueness is not guaranteed.

If GUIDs were a free-for-all, there would be problems. GUID generators
are constrained to algorithms that minimize collisions. Also, if we're
specifically talking about Win32 processes (including .NET processes):
yes, they do all generate GUIDs from the same source.
 And since the Guid structure (the most  
obvious source of a GUID and the one being proposed here) does not  
guarantee uniqueness, then so too uniqueness is not guaranteed for the  
filename.
I've addressed this one downthread. We're not shooting for eternal
guarantees here.
Furthermore, you need not invoke any deity to help you avoid collisions.  
You simply need to make sure you verify uniqueness before actually using  
the GUID as the filename, and keep getting new GUIDs until that uniqueness  
is confirmed.

As I've mentioned above, I wouldn't bother. You have to check for
failure, but cycling through GUIDs is pointless as far as increasing
the odds of success go.
 Yes, in the vast majority of cases, you'll only ever have  
to generate a single GUID.  But you can't make any guarantees unless you  
make the check, and correct programming is all about the _guarantees_ you 
can make, not wishful thinking.
This is the common programmer attitude of "events with extremely low
probability are bunk, only absolutes matter". This is a valuable
attitude in most cases, because programmers deluding themselves as to
the safety of their code is an important problem. My argument is that
in this case, the odds are solid.
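
To put a rough number on "solid": the estimate below is the usual birthday bound, under the assumption (not established anywhere in this thread) that the GUIDs are the random, version 4 kind, which leaves 122 random bits, and that they are drawn uniformly. The figure of a billion files is an arbitrary example.

```latex
% Birthday bound for n names drawn uniformly from N = 2^{122} values
P(\text{collision}) \approx \frac{n(n-1)}{2N} \approx \frac{n^{2}}{2^{123}}

% Example: n = 10^{9} file names in one directory, 2^{123} \approx 1.06 \times 10^{37}
P(\text{collision}) \approx \frac{10^{18}}{1.06 \times 10^{37}} \approx 9.4 \times 10^{-20}
```

This covers only name collisions between such GUIDs. It says nothing about the other ways file creation can fail, which is why the error handling is still needed either way.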
Why does it have to be about losing face or not losing face?  Why can't we  
just have a technical discussion, where the goal is to arrive at the  
correct technical conclusion?
Eh, that works too, I suppose. But this is Usenet, you can't expect
people to be rational all the time. Think of how boring that would
get!
Being wrong is no reason to "lose face".  Everyone makes mistakes from  
time to time, and being willing to admit that and move on is the sign of  
someone who never needs to feel like they've "lost face".
That's easy enough if you've made an obvious, overt clunker of a
mistake, but usually it boils down to shades of difference and
agreeing to disagree and all that. There's also the fact that fatigue
occurs long before you've worked out what is absolutely, without
reservations, the correct technical conclusion from all points of
view.

The problem is usually not being wrong, the problem is being not
absolutely right. Getting to the bottom of that takes a lot longer.
 
[...]
That's overblown. What the documentation says, literally, in one
sentence, is "such an identifier has a very low probability of being
duplicated." And this is true.

Thank you for admitting your error.
You seem to think I'm making the argument that GUIDs are absolutely
unique with 100% probability. If that's how it appeared, I did make a
mistake, because that was certainly not my intent.
No.  This is never true.  Writing your code to rely on very low  
probabilities of an error is incorrect.  Code like that can and will  
eventually fail.
OK, now we're at the heart of the argument! I disagree with the
assertion that it's incorrect. I only agree for high values of "low
probability", which are actually the majority. GUIDs are in a
different league.
 
Maybe they should rename it GNUID?

My suggestion doesn't reserve the file by creating it, nor does it check the
file doesn't exist, but these are easy steps to add. I'd still go with the
GUID, even if I were paranoid enough to add the additional checks.
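
For what it's worth, the two extra steps (reserving the name by creating the file, and making sure it did not already exist) can be done in one go by opening the file with FileMode.CreateNew, which throws if the file is already there. A minimal sketch, not anyone's posted code: the class name UniqueFile, the method CreateUnique, the ".tmp" extension and the retry limit are all made up for illustration.

```csharp
using System;
using System.IO;

static class UniqueFile
{
    // Arbitrary retry limit, chosen only for this sketch.
    private const int MaxAttempts = 5;

    // Hypothetical helper: creates a new, empty file in 'directory'
    // and returns it already opened for the caller.
    public static FileStream CreateUnique(string directory)
    {
        for (int attempt = 0; attempt < MaxAttempts; attempt++)
        {
            // "N" formats the GUID as 32 hex digits without hyphens.
            string name = Guid.NewGuid().ToString("N") + ".tmp";
            string path = Path.Combine(directory, name);
            try
            {
                // CreateNew creates the file and throws IOException if it
                // already exists, so the check and the reservation are one step.
                return new FileStream(path, FileMode.CreateNew,
                                      FileAccess.ReadWrite, FileShare.None);
            }
            catch (IOException)
            {
                // Retry only when the name was actually taken. Anything else
                // (missing directory, full disk, ...) goes back to the caller.
                if (!File.Exists(path))
                {
                    throw;
                }
            }
        }
        throw new IOException("No unique file name found in " + directory);
    }
}
```

Returning the open FileStream rather than just the name means there is no gap between creating the file and using it. Whether the retry loop is worth having is exactly what the rest of this thread argues about; dropping it leaves a single CreateNew attempt whose failure the caller has to handle anyway.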
 