[pycrypto] RandomPool

Sun Sep 6 11:54:42 CST 2009

On Thu, Aug 27, 2009 at 01:59:50PM -0400, Paul Koning wrote:
>Hi...
>
>My first post to this list, though I've used pycrypto for 2-3 years
>now.

Hello and welcome!

>RFC 4086, "Randomness Requirements for Security" is very much worth
>studying for a discussion of this subject.

Agreed.  In addition, it is worth looking at the papers listed in the 
"references" section below.

>Dwayne comments "Also, after looking a bit more at OS-provided random
>generators, I'm starting to think that just returning their output
>might not be such a great idea.  There just doesn't seem to be any
>reason to trust them very far."  I don't know what OS is at issue
>here; it would be good for a general statement like that to be
>accompanied by specific evidence.  For example, I spent quite a while
>looking at the Linux /dev/random code by Ted Ts'o, and my conclusion
>is just the opposite.  It looks strong and well constructed.  So could
>you spell which OS you were talking about, and why you concluded it
>should not be trusted?

I assume you are referring to this post:

     http://lists.dlitz.net/pipermail/pycrypto/2008q3/000002.html

There is no single OS at issue here.  Since PyCrypto runs on more platforms 
than just Linux, I would need to trust the OS-provided random number 
generators on *all* of those platforms before I could feel comfortable 
making Crypto.Random.new an alias for Crypto.Random.OSRNG.new.

Here are some of the reasons why I'm not comforable with doing that:

=== 1. MS Windows (XP and earlier) ===

Practical vulnerabilities to state compromise extension attacks on a 
platform where the likelihood of malicious code being executed at *some* 
point in time is relatively high.

See http://eprint.iacr.org/2007/419 [4].

=== 2. MS Windows (Later than XP) ===

As far as I know, the algorithm that replaced the previous Windows RNG 
remains unpublished.  I see no reason to trust it.

=== 3. Anything that implements Dual_EC_DRBG (from NIST SP800-90) ===

See [5]:
http://en.wikipedia.org/w/index.php?title=Dual_EC_DRBG&oldid=306143447#Controversy 

I don't know if Dual_EC_DRBG is ever used as an OS-provided random number 
generator, but if so, it could be a trivial back-door if its output is 
provided directly to an attacker.  Mixing its output with other sources of 
entropy (or even truncating the output, apparently) should help prevent an  
attacker from reversing the algorithm.

=== 4. Solaris, *BSD, proprietary *nix, Minix, HURD ===

No idea, but if these excerpts from Wikipedia [6] are true, they don't help 
earn my trust:

- "In 2004, Landon Curt Noll tested the FreeBSD 5.2.1 version of 
   /dev/random and found that it was not a cryptographically strong random 
   number generator because its output had multiple uniformity flaws 
   according to the Billion bit test. Similar flaws were found in the Linux 
   2.4.21-20, Solaris 8 patch 108528-18, and Mac OS X 10.3.5 implementations 
   of /dev/random."

- "... as with FreeBSD, AIX implements its own Yarrow-based design which 
   uses considerably fewer entropy sources than the standard /dev/random 
   implementation and stops refilling the pool when it thinks it contains 
   enough entropy."

=== 5. Linux ===

Linux's random.c is probably the most trustworthy of the bunch, but some 
elements of its design and implememntation remain controversial.  Quoting 
the comments in drivers/char/random.c:

| There are three exported interfaces; the first is one designed to
| be used from within the kernel:
|
|      void get_random_bytes(void *buf, int nbytes);
|
| This interface will return the requested number of random bytes,
| and place it in the requested buffer.
|
| The two other interfaces are two character devices /dev/random and
| /dev/urandom.  /dev/random is suitable for use when very high
| quality randomness is desired (for example, for key generation or
| one-time pads), as it will only return a maximum of the number of
| bits of randomness (as estimated by the random number generator)
| contained in the entropy pool.
|
| The /dev/urandom device does not have this limit, and will return
| as many bytes as are requested.  As more and more random bytes are
| requested without giving time for the entropy pool to recharge,
| this will result in random numbers that are merely cryptographically
| strong.  For many applications, however, this is acceptable.

The designers of Linux's /dev/random claim that it provides 
information-theoretic security based on entropy estimates.

As noted by John Viega [2], entropy estimation is problematic:

   "One significant problem is a lack of methodology for deriving reasonable 
   estimates.
   . . .
   "Information theory does provide ways to measure entropy, but they are 
   not practical, because one can only model how much entropy is in data if 
   one has a complete understanding of how the data is produced and all 
   possible channels an attacker may use to measure information about the 
   data.  Considering that there are a broad range of possible threat 
   models, and considering that machines behave deterministically yet are 
   still incredibly complex in practice, one should expect data to tend to 
   be predictable (the only times where significant new entropy can really 
   added be to a system are when the machine receives external input), yet 
   it is incredibly difficult to figure out just how predictable."

However, the comments in drivers/char/random.c make it sound as if using 
/dev/random is the preferred, but /dev/urandom can be used as a last 
resort, since it provides "merely cryptographically strong" output.

In reality, /dev/random is a red herring for two reasons:

   a) It provides information-theoretic security *only* *if* the 
   implementation does not overestimate the information available to an 
   attacker.  As noted above, whether that's even possible is debatable.

   b) Applications need the ability to get cryptographically-strong random 
   numbers in a timely manner.  /dev/random blocks when it runs out of 
   (estimated) entropy, so it's useless in most situations.  Worse still, 
   since the entropy is finite, the more it gets used, the longer 
   applications must wait to have their requests satisfied.  As a result, 
   hardly anything uses /dev/random.

Furthermore, Ted Ts'o posted a criticism that makes the accusation that 
with Fortuna, entropy is "thoughtlessly squandered".  
(http://lkml.indiana.edu/hypermail/linux/kernel/0409.3/0646.html)
However, look at this (again in drivers/char/random.c):

| add_input_randomness() uses the input layer interrupt timing, as well as
| the event type information from the hardware.
|
| add_interrupt_randomness() uses the inter-interrupt timing as random
| inputs to the entropy pool.  Note that not all interrupts are good
| sources of randomness!  For example, the timer interrupts is not a
| good choice, because the periodicity of the interrupts is too
| regular, and hence predictable to an attacker.  Disk interrupts are
| a better measure, since the timing of the disk interrupts are more
| unpredictable.

The recommendation is to avoid sources of randomness that *might* be 
predictable to an attacker.  But disk interrupts might be predictable if 
you're using solid-state drives, and input-layer interrupts might be 
predictable if you're using a wireless keyboard and mouse.  Apparently, a 
conservative user should patch his kernel to remove all of these sources of 
entropy!

Linux's RNG also wastes entropy, simply by disregarding it.  With Fortuna, 
you can use all of these sources of randomness, and it won't matter as long 
as *any* of them remains unpredictable to an attacker.

Further:

| All of these routines try to estimate how many bits of randomness a
| particular randomness source.  They do this by keeping track of the
| first and second order deltas of the event timings.

I would love to see an analysis of how much information could be leaked via 
the world-readable /proc/sys/kernel/random/entropy_avail file, and whether 
that information could be used in an attack (either a real attack, or an 
attack by a computationally unbounded attacker.)

To summarize, while I think Linux's RNG is probably the best OS-provided 
RNG I've seen, there are still several things that make me raise my 
eyebrows:

- The claims about "information-theoretic security", which are both 
   questionable (since they might not even be possible) and irrelevant 
   (since /dev/random is mostly unusable anyway).

- The apparent preference of /dev/random over /dev/urandom, even though the 
   former is nearly unusable in real-world applications.

- Ted Ts'o claims that Fortuna wastes entropy, while ignoring the entropy 
   in thousands of timer interrupts that Linux has to ignore due to the 
   design of his RNG.

- The world-readable /proc/sys/kernel/random/entropy_avail file, which is 
   apparently derived from the input into the entropy pool.

===========

On Thu, Aug 27, 2009 at 01:59:50PM -0400, Paul Koning wrote:
>What is Fortuna?  What makes it good enough that an application-level
>RNG can be safely layered on top of it?

http://en.wikipedia.org/wiki/Fortuna_(PRNG)

Fortuna was designed by Niels Ferguson and Bruce Schneier, and is described 
in their book, /Practical Cryptography/ [7].

Fortuna consists of two parts: a series of 32 SHA256-based entropy pools 
(collectively called the "accumulator") and a PRNG (the "generator") that 
basically just runs AES in counter mode.  The generator is periodically 
re-seeded from the accumulator.

Fortuna is designed to resist state compromise extension attacks without 
any need to estimate entropy, and it works well in situations where you 
have several sources of randomness that may or may not be known or 
controlled by an attacker.

PyCrypto's Crypto.Random implementation currently seeds Fortuna from three 
sources: the OS-provided RNG (abstracted by Crypto.Random.OSRNG), 
time.time(), and time.clock().  One of the latter two is usually a 
high-resolution timer, depending on the platform you're on.

I hope that helps to clear things up.

Cheers,
  - Dwayne

References:

[1] "Cryptanalytic Attacks on Pseudorandom Number Generators" (1998) by 
John Kelsey, Bruce Schneier, David Wagner, and Chris Hall.  
http://www.schneier.com/paper-prngs.html

[2] "Practical Random Number Generation in Software" (2003) by John Viega 
http://www.acsac.org/2003/abstracts/79.html

[3] "Analysis of the Linux Random Number Generator" (2006) by Zvi 
Gutterman, Benny Pinkas, and Tzachy Reinman.
http://www.pinkas.net/PAPERS/gpr06.pdf

[4] "Cryptanalysis of the Random Number Generator of the Windows Operating 
System" (Nov 2007) by Leo Dorrendorf, Zvi Gutterman, and Benny Pinkas 
http://eprint.iacr.org/2007/419

[5] "Dual_EC_DRBG - Controversy" on Wikipedia
http://en.wikipedia.org/w/index.php?title=Dual_EC_DRBG&oldid=306143447#Controversy

[6] "/dev/random" on Wikipedia
http://en.wikipedia.org/w/index.php?title=/dev/random&oldid=300118729

[7] /Practical Cryptography/ (2003) by Niels Ferguson and Bruce Schneier.  
ISBN 0-471-22357-3.  http://www.schneier.com/book-practical.html

-- 
Dwayne C. Litzenberger <dlitz at dlitz.net>
Key-signing key   - 19E1 1FE8 B3CF F273 ED17  4A24 928C EC13 39C2 5CF7