A language model is a probability distribution over sequences of words: given any sequence of m words, it assigns a probability P(w_1, ..., w_m) to the whole sequence.
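Because there are infinitely many valid sentences (the property of digital infinity), any finite training corpus contains only a small fraction of them. A language model therefore faces the problem of assigning a non-zero probability to valid sentences that are not present in the training data; a model estimated purely from observed counts assigns such sentences a probability of zero.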
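
The problem can be illustrated with a minimal sketch. It assumes a bigram model (each word conditioned only on the previous word), a two-sentence toy corpus, and add-one smoothing as one possible remedy; none of these choices come from the text above, and the function names `train_bigram_counts` and `sentence_probability` are hypothetical.

```python
from collections import Counter

def train_bigram_counts(sentences):
    """Count bigrams and their left contexts, using <s> and </s> as boundary markers."""
    context_counts = Counter()   # how often each token occurs as a left context
    bigram_counts = Counter()    # how often each (previous, current) pair occurs
    vocab = set()                # tokens that can be predicted (words and </s>)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens[1:])
        for prev, cur in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, vocab

def sentence_probability(sentence, context_counts, bigram_counts, vocab, alpha=0.0):
    """Multiply the conditional bigram probabilities of a sentence.

    alpha=0.0 gives the plain count-based (maximum likelihood) estimate;
    alpha=1.0 gives add-one smoothing. Assumes every word in the sentence
    is in the training vocabulary.
    """
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, cur in zip(tokens, tokens[1:]):
        numerator = bigram_counts[(prev, cur)] + alpha
        denominator = context_counts[prev] + alpha * len(vocab)
        if denominator == 0:
            return 0.0  # context never observed and no smoothing applied
        prob *= numerator / denominator
    return prob

# Toy training corpus (an assumption for illustration only).
corpus = ["the cat sleeps", "the dog barks"]
context_counts, bigram_counts, vocab = train_bigram_counts(corpus)

seen = "the cat sleeps"    # present in the training data
unseen = "the cat barks"   # valid, but never observed

p_seen = sentence_probability(seen, context_counts, bigram_counts, vocab)
p_unseen = sentence_probability(unseen, context_counts, bigram_counts, vocab)
p_unseen_smoothed = sentence_probability(
    unseen, context_counts, bigram_counts, vocab, alpha=1.0
)

print(p_seen, p_unseen, p_unseen_smoothed)
```

In this sketch the unseen sentence receives probability zero under the purely count-based estimate, because the pair ("cat", "barks") never occurs in the corpus, while the smoothed estimate is small but non-zero. Smoothing is only one of several ways to reserve probability mass for unseen sequences.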