In this paper, we present a comprehensive linguistic study assessing the implicit behaviour of one of the most prominent Transformer-based Neural Language Models (NLMs), BERT (Devlin et al., 2019), when dealing with a particular source of noisy data, namely essays written by L1 Italian learners that contain a variety of errors affecting grammar, orthography and lexicon. Unlike previous work, we focus on the pre-training stage and devise two evaluation tasks that assess the impact of errors on sentence-level inner representations from two complementary perspectives, i.e. robustness and sensitivity. Our experiments show that BERT’s ability to compute sentence similarity and to correctly encode a set of raw and morpho-syntactic properties of a sentence is modulated differently by the category of error, and that the error hierarchies in terms of robustness and sensitivity change across layer-wise representations.