```python
data_fp = download_mnist()
data_fp
```

```
PosixPath('data/mnist.pkl.gz')
```
Adapted from:

- `download_mnist(path_gz=Path('data/mnist.pkl.gz'))`
- `download_file(url, destination)`
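The bodies of these helpers aren't shown here, but a minimal sketch might look like the following. The `MNIST_URL` constant is hypothetical; point it at wherever you host a copy of `mnist.pkl.gz`.

```python
from pathlib import Path
from urllib.request import urlretrieve

MNIST_URL = '...'  # hypothetical: substitute a real mirror of mnist.pkl.gz

def download_file(url, destination):
    # Download `url` to `destination`, skipping the download if it already exists
    if not destination.exists():
        urlretrieve(url, destination)

def download_mnist(path_gz=Path('data/mnist.pkl.gz')):
    # Ensure the dataset is on disk and return its path
    path_gz.parent.mkdir(parents=True, exist_ok=True)
    download_file(MNIST_URL, path_gz)
    return path_gz
```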
Note that \(784 = 28^2\): each MNIST image is a 28×28 grid of pixels, flattened into a vector of 784 values.
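As a quick check, we can load the file and inspect the shapes. This sketch assumes the standard `mnist.pkl.gz` layout of `((x_train, y_train), (x_valid, y_valid), (x_test, y_test))`, pickled with latin-1 strings:

```python
import gzip, pickle

with gzip.open(data_fp, 'rb') as f:
    ((x_train, y_train), (x_valid, y_valid), _) = pickle.load(f, encoding='latin-1')

x_train.shape  # (50000, 784): 50,000 training images, each flattened to 784 pixels
```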
All of this is implemented by PyTorch (along with all the auto-differentiation machinery).
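For instance, here is a minimal PyTorch sketch of elementwise tensor arithmetic with gradients tracked:

```python
import torch

a = torch.tensor([3., 5., 6.], requires_grad=True)
loss = (a * 3).sum()   # elementwise multiply, then reduce to a scalar
loss.backward()        # PyTorch computes the gradients automatically
a.grad                 # tensor([3., 3., 3.]), since d(3a_i)/da_i = 3
```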
This goes back to the invention of the APL language: it started as a mathematical notation devised by Kenneth Iverson, which he and Adin Falkoff adapted into a programming language in the 1960s, extending it with ideas from tensor analysis in physics.
You can try it here.
Defining a tensor (or "array," in APL's own terminology), `a`:

```
a ← 3 5 6
```

Multiplying by a scalar:

```
a × 3
```

Element-wise division:

```
b ← 7 8 9
a ÷ b
```
NumPy was influenced by APL, and PyTorch was in turn influenced by NumPy.
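The APL examples above translate almost directly to NumPy:

```python
import numpy as np

a = np.array([3, 5, 6])
a * 3                    # array([ 9, 15, 18])
b = np.array([7, 8, 9])
a / b                    # array([0.42857143, 0.625, 0.66666667])
```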
One thing that differs is that in PyTorch scalars are just rank-0 tensors, whereas NumPy gives scalars their own special types and semantics.
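A quick sketch of the difference:

```python
import numpy as np
import torch

torch.tensor(3.5).ndim        # 0: in PyTorch a scalar is a rank-0 tensor
type(np.array([3.5])[0])      # <class 'numpy.float64'>: a dedicated scalar type
np.isscalar(np.float64(3.5))  # True
```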
There is no way to generate truly random numbers on a typical computer. For true randomness, you have to measure natural phenomena.
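Operating systems gather entropy from such sources (interrupt timings, hardware noise) and expose it; for example:

```python
import os

os.urandom(8)   # 8 bytes drawn from the OS entropy pool, different on every call
```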
Generally, though, all we need is pseudo-randomness from a deterministic algorithm like Wichmann-Hill.
```python
from dataclasses import dataclass

@dataclass
class RNG:
    x: int = None
    y: int = None
    z: int = None

    def seed(self, a):
        # Split the seed into three independent pieces of internal state
        a, x = divmod(a, 30268)
        a, y = divmod(a, 30306)
        a, z = divmod(a, 30322)
        self.x = int(x) + 1
        self.y = int(y) + 1
        self.z = int(z) + 1

    def random(self):
        # Advance three small linear congruential generators...
        self.x = (171 * self.x) % 30269
        self.y = (172 * self.y) % 30307
        self.z = (170 * self.z) % 30323
        # ...and combine them into a single float in [0, 1)
        return (self.x / 30268 + self.y / 30306 + self.z / 30322) % 1
```

```python
rng = RNG()
rng.seed(42)
rng.random(), rng.random(), rng.random()
```

```
(0.25421176102342913, 0.4689255225976794, 0.19544471247365425)
```
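Because the generator is deterministic, seeding a fresh instance with the same value reproduces the same sequence:

```python
rng2 = RNG()    # a fresh instance, so we don't disturb rng's state
rng2.seed(42)
rng2.random()   # 0.25421176102342913, the same first draw as above
```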
It is important to keep Unix process semantics in mind when working with random numbers. What happens if we fork the process?
```python
import os

if os.fork():
    print(f"parent process: {rng.random():.2f}")
else:
    print(f"child process: {rng.random():.2f}")
```

```
parent process: 0.86
child process: 0.86
```
These are the same number! This is because the RNG's internal state is copied into the child process on fork, so both processes go on to draw identical values. You need to reseed the RNG in each process, like so.