Recall that "systems" in quantum computing are mathematically represented by Hilbert spaces: vector spaces over the complex numbers with an inner product (among other properties). I will use $\mathcal{H}$ to mean a Hilbert space.
When we measure a composite system over $\mathcal{H}_{A} \otimes \mathcal{H}_{B}$ in the state $\ket{\psi} = \sum_{i, j} \alpha_{ij} \ket{i, j}$, the measurement yields
$$ \ket{i, j} \text{ with probability } |\alpha_{ij}|^2 $$
and the state of the system after the measurement is $\ket{i, j}$.
What happens if we only measure the second half of the system ($\mathcal{H}_{B}$)?
What should we expect (intuitively)?
Let's re-arrange the terms in $\ket{\psi}$: $$ \ket{\psi} = \sum_{i, j} \alpha_{ij} \ket{i, j} = \sum_{j} \left(\sum_{i} \alpha_{ij} \ket{i}\right)\otimes \ket{j} $$
Let $\beta_j = \sqrt{\sum_{i} |\alpha_{ij}|^2}$ (the norm of the unnormalized state $\sum_{i} \alpha_{ij}\ket{i}$). Then, keeping only the terms with $\beta_j \neq 0$, we can re-write the sum as $$ \ket{\psi} = \sum_{j} \beta_j \left(\sum_{i} \frac{\alpha_{ij}}{\beta_{j}}\ket{i}\right) \otimes \ket{j} $$
Now, if we measure system $\mathcal{H}_{B}$, we'll see outcome $\ket{j}$ with probability $$ \beta_j^2 = \sum_{i} |\alpha_{ij}|^2 $$
And the state of $\mathcal{H}_{A}$ after seeing $\ket{j}$ is the normalized state $$ \frac{1}{\beta_{j}}\sum_{i} \alpha_{ij}\ket{i} $$
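To make the partial-measurement rule concrete, here is a minimal sketch in plain Python. The function name `measure_second_subsystem` and the nested-list layout `alpha[i][j]` are assumptions made for illustration. The normalization constant is taken to be the norm $\sqrt{\sum_i |\alpha_{ij}|^2}$ of the column for outcome $j$.

```python
import math

def measure_second_subsystem(alpha):
    """alpha[i][j] is the amplitude of |i, j>.

    Returns, for each outcome j on the second subsystem, a pair
    (probability of j, normalized state left on the first subsystem).
    """
    dim_a, dim_b = len(alpha), len(alpha[0])
    results = []
    for j in range(dim_b):
        column = [alpha[i][j] for i in range(dim_a)]
        prob = sum(abs(a) ** 2 for a in column)
        if prob == 0:
            results.append((0.0, None))  # this outcome never occurs
            continue
        norm = math.sqrt(prob)  # norm of the unnormalized column
        results.append((prob, [a / norm for a in column]))
    return results

# Example state: (|00> + |01> + |10>) / sqrt(3)
s = 1 / math.sqrt(3)
alpha = [[s, s], [s, 0.0]]
for j, (p, state) in enumerate(measure_second_subsystem(alpha)):
    print(j, p, state)
```

In this example, outcome 0 occurs with probability 2/3 and leaves the first qubit in $(\ket{0} + \ket{1})/\sqrt{2}$, while outcome 1 occurs with probability 1/3 and leaves it in $\ket{0}$.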
Imagine Alice and Bob are friends, and Alice has a state $\ket{\psi} = \alpha\ket{0} + \beta\ket{1}$ that she wants to share with Bob. However, Bob is really far away, so Alice can't just give Bob her system, and she wants to make sure that Bob gets $\ket{\psi}$ exactly (no rounding). How can she do this?
She could try to send a classical description of $\ket{\psi}$, but that could require an infinite number of bits if $\ket{\psi}$ has amplitudes that use transcendental numbers.
Why is this something special? Quantum states are fragile: Alice couldn't even obtain a classical description of $\ket{\psi}$ to infinite precision, since measuring her system disturbs the state.
Say that Alice and Bob happened to prepare a pair of qubits as an EPR pair, each keeping one qubit, before Alice wanted to share $\ket{\psi}$ with Bob: $$ \ket{\text{EPR}} = \frac{1}{\sqrt{2}}\left(\ket{00} + \ket{11}\right) $$ There's a protocol that Alice and Bob can perform that will give Bob $\ket{\psi}$ and only uses 2 bits of classical communication. It's described by the following circuit:
In the circuit here, we see a number of special gates
In order to analyze the circuit, let's compute the state of the system after every gate.
On Whiteboard
At the beginning, the state is $$ \ket{\psi} \otimes \frac{1}{\sqrt{2}}\left(\ket{00} + \ket{11}\right) = \frac{1}{\sqrt{2}} \left(\alpha\ket{000} + \alpha\ket{011} + \beta\ket{100} + \beta\ket{111}\right) $$
After applying the CNOT gate to the first 2 qubits, we have the state $$ \frac{1}{\sqrt{2}} \left(\alpha\ket{000} + \alpha\ket{011} + \beta\ket{110} + \beta \ket{101}\right) $$
After applying H to the first qubit, we have the state $$ \frac{1}{2} \left(\alpha\left(\ket{0} + \ket{1}\right)\otimes\ket{00} + \alpha\left(\ket{0} + \ket{1}\right)\otimes\ket{11} + \beta\left(\ket{0} - \ket{1}\right)\otimes \ket{10} + \beta\left(\ket{0} - \ket{1}\right) \otimes \ket{01}\right) $$
Regrouping the terms based on the state of the first 2 qubits, we get the following state $$ \frac{1}{2}\left(\ket{00}\otimes \left(\alpha\ket{0} + \beta\ket{1}\right) + \ket{10} \otimes \left(\alpha\ket{0} - \beta\ket{1}\right) + \ket{01} \otimes \left(\alpha\ket{1} + \beta\ket{0}\right) + \ket{11} \otimes \left(\alpha\ket{1} - \beta\ket{0}\right)\right) $$
So, consider the possible outcomes Alice could see when measuring the state in the computational basis.
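The four cases can be read off from the regrouped state. The summary below is a sketch: it writes $X$ and $Z$ for the Pauli operators (notation assumed here) and lists Alice's outcome as the first two qubits in the order they appear in the state. Each outcome occurs with probability $\left|\tfrac{1}{2}\right|^2 = \tfrac{1}{4}$, independent of $\alpha$ and $\beta$.

$$ \begin{aligned} 00 &: \text{Bob holds } \alpha\ket{0} + \beta\ket{1}, && \text{and applies nothing } (I) \\ 10 &: \text{Bob holds } \alpha\ket{0} - \beta\ket{1}, && \text{and applies } Z \\ 01 &: \text{Bob holds } \alpha\ket{1} + \beta\ket{0}, && \text{and applies } X \\ 11 &: \text{Bob holds } \alpha\ket{1} - \beta\ket{0}, && \text{and applies } X \text{ then } Z \end{aligned} $$

In every case the corrected state is $\alpha\ket{0} + \beta\ket{1}$. For instance, in the last case $X$ maps $\alpha\ket{1} - \beta\ket{0}$ to $\alpha\ket{0} - \beta\ket{1}$, and $Z$ then flips the sign of the $\ket{1}$ term.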
Thus, Bob is able to correctly recover $\ket{\psi}$ when he receives the bits that Alice measures and applies the corresponding Pauli operators.
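As a sanity check on the derivation, here is a minimal state-vector simulation of the protocol in plain Python. It is a sketch under stated assumptions: the function name `teleport`, the `outcome` parameter (which forces Alice's measurement result so every branch can be checked), and the amplitude indexing are all choices made for illustration. The correction rule assumed is the standard one: apply $X$ if the second measured bit is 1, then $Z$ if the first measured bit is 1.

```python
import math
import random

R = 1 / math.sqrt(2)

def teleport(alpha, beta, outcome=None):
    """Simulate teleporting alpha|0> + beta|1>.

    Amplitudes are indexed by the bits (a, b, c) as 4a + 2b + c, where
    qubit a holds |psi>, b is Alice's half of the EPR pair and c is Bob's.
    `outcome` (0..3, encoding Alice's bits as 2a + b) forces her result;
    None samples it from the measurement distribution.
    """
    state = [0j] * 8
    state[0b000] = state[0b011] = alpha * R
    state[0b100] = state[0b111] = beta * R

    # CNOT with control a and target b: flip bit b whenever a = 1.
    after_cnot = [0j] * 8
    for idx, amp in enumerate(state):
        a = idx >> 2
        after_cnot[idx ^ (a << 1)] += amp

    # Hadamard on a: |0> -> (|0> + |1>)/sqrt(2), |1> -> (|0> - |1>)/sqrt(2).
    after_h = [0j] * 8
    for idx, amp in enumerate(after_cnot):
        a, rest = idx >> 2, idx & 0b011
        after_h[rest] += R * amp
        after_h[4 | rest] += R * amp * (-1 if a else 1)

    # Alice measures qubits a and b in the computational basis.
    probs = [abs(after_h[2 * k]) ** 2 + abs(after_h[2 * k + 1]) ** 2
             for k in range(4)]
    if outcome is None:
        outcome = random.choices(range(4), weights=probs)[0]
    a, b = outcome >> 1, outcome & 1
    norm = math.sqrt(probs[outcome])
    bob = [after_h[2 * outcome] / norm, after_h[2 * outcome + 1] / norm]

    # Bob's corrections: X if b = 1, then Z if a = 1.
    if b:
        bob = [bob[1], bob[0]]
    if a:
        bob = [bob[0], -bob[1]]
    return (a, b), bob

# Check every branch for one example state.
for k in range(4):
    bits, bob = teleport(0.6, 0.8j, outcome=k)
    print(bits, bob)
```

For each of the four forced outcomes, Bob's corrected amplitudes should agree with the input `(0.6, 0.8j)` up to floating-point error.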
While it may seem like we've cloned $\ket{\psi}$ (because Bob has a copy of it now), it's important to remember that Alice only has the measurement outcomes (she's measured her qubits, so she no longer holds $\ket{\psi}$). This doesn't violate the no-cloning principle.
The classical communication is necessary for this protocol. If Bob didn't receive the bits, then his qubit would still "look like" an even mix between $\ket{0}$ and $\ket{1}$ (although we won't formalize what this means until later in the class).
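One partial check of this claim, offered as a sketch rather than the formal statement deferred to later: suppose Bob measures his qubit in the computational basis without knowing Alice's bits. Starting from the three-qubit state just before Alice's measurement, the terms in which Bob's qubit is $\ket{0}$ have amplitudes $\tfrac{\alpha}{2}, \tfrac{\alpha}{2}, \tfrac{\beta}{2}, -\tfrac{\beta}{2}$ (one for each of Alice's four outcomes), so

$$ \Pr[\text{Bob sees } 0] = \frac{1}{4}\left(|\alpha|^2 + |\alpha|^2 + |\beta|^2 + |\beta|^2\right) = \frac{1}{2}\left(|\alpha|^2 + |\beta|^2\right) = \frac{1}{2} $$

and likewise $\Pr[\text{Bob sees } 1] = \tfrac{1}{2}$. The result does not depend on $\alpha$ or $\beta$, so this measurement tells Bob nothing about $\ket{\psi}$. This only covers one choice of measurement basis.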
So, teleportation is still limited by the speed of the classical communication, which is at most the speed of light. There's still no way to use it to send information faster than light.
The Bell states are special states of a 2-qubit system that are maximally entangled. The 4 Bell states are
$$ \ket{\Phi^{+}} = \frac{1}{\sqrt{2}}\left(\ket{00} + \ket{11}\right) $$
$$ \ket{\Phi^{-}} = \frac{1}{\sqrt{2}}\left(\ket{00} - \ket{11}\right) $$
$$ \ket{\Psi^{+}} = \frac{1}{\sqrt{2}}\left(\ket{01} + \ket{10}\right) $$
$$ \ket{\Psi^{-}} = \frac{1}{\sqrt{2}}\left(\ket{01} - \ket{10}\right) $$
The measurement that Alice performs on her qubits (CNOT followed by Hadamard and then measuring both qubits) implements a Bell basis measurement (i.e. it is a projection-valued measurement whose projectors are the projectors onto the 4 Bell states).
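The claim that CNOT followed by Hadamard turns a computational-basis measurement into a Bell-basis measurement can be checked directly: the circuit should map each Bell state to a distinct computational basis state. The following plain-Python sketch does this. The names `cnot_then_h`, `BELL`, and `measured_bits` are hypothetical, and amplitudes are stored as `state[2*a + b]` for $\ket{ab}$, with the first qubit as the CNOT control and the Hadamard target.

```python
import math

R = 1 / math.sqrt(2)

def cnot_then_h(state):
    """Apply CNOT (control a, target b), then H on a.

    state[2*a + b] is the amplitude of |a b>.
    """
    # CNOT swaps the amplitudes of |10> and |11>.
    c = [state[0], state[1], state[3], state[2]]
    # H on a: new |0 b> = (|0 b> + |1 b>)/sqrt(2), new |1 b> = (|0 b> - |1 b>)/sqrt(2).
    return [R * (c[0] + c[2]), R * (c[1] + c[3]),
            R * (c[0] - c[2]), R * (c[1] - c[3])]

BELL = {
    "Phi+": [R, 0, 0, R],
    "Phi-": [R, 0, 0, -R],
    "Psi+": [0, R, R, 0],
    "Psi-": [0, R, -R, 0],
}

def measured_bits(state, tol=1e-9):
    """Return the bit string 'ab' if the state is (up to phase) |a b>."""
    hits = [k for k, amp in enumerate(state) if abs(abs(amp) - 1) < tol]
    assert len(hits) == 1, "not a computational basis state"
    return format(hits[0], "02b")

for name, state in BELL.items():
    print(name, "->", measured_bits(cnot_then_h(state)))
```

Working the algebra by hand gives $\ket{\Phi^{+}} \mapsto \ket{00}$, $\ket{\Phi^{-}} \mapsto \ket{10}$, $\ket{\Psi^{+}} \mapsto \ket{01}$, and $\ket{\Psi^{-}} \mapsto \ket{11}$, so each pair of measured bits identifies exactly one Bell state.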