The mass digitization of libraries, national archives, and museums requires automated processing of the acquired image data for further preparation (indexing, word spotting) and for improving access to the content, i.e. document analysis. Projects and institutions dealing with the digitization of documents include, among others, the manuscript research center of Graz University (Vestigia), Improving Access to Text (IMPACT), and Google Books by Google Inc.

Document preprocessing is one of the most important steps of document image analysis and is defined as noise removal and binarization, i.e. foreground/background separation. An additional preprocessing step is the skew estimation of documents, which can be based either on binarized images or on the original gray-value image. Uncorrected skew can degrade the performance of Optical Character Recognition (OCR) and of segmentation (layout analysis) methods.

Document classification can be used for automated indexing in digital libraries, e.g. by classifying all "Table of Contents" pages, or it enables document retrieval in large document image databases. By classifying document types, a priori knowledge (e.g. the position of text boxes) can be incorporated into the document image analysis system, thus facilitating higher-level document analysis. While binarization and skew estimation are classical preprocessing steps, form classification is added as a preprocessing step within this thesis. The research within this thesis deals with these three preprocessing steps for ancient and historical documents with sparsely inscribed information (printed or written text). Historical documents can be degraded (e.g. faded ink or noise such as background stains) or fragmented due to their storage conditions. The proposed methods are evaluated using state-of-the-art metrics and are compared to methods from current document image analysis contests on binarization and skew estimation.
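To make the foreground/background separation concrete, the following is a minimal sketch of global binarization using Otsu's method, a common baseline rather than the method developed in this thesis. The pixel data is an invented toy example (dark "ink" values around 40-50, light "parchment" values around 190-210).

```python
def otsu_threshold(pixels):
    """Global binarization threshold via Otsu's method: choose the
    gray value that maximizes the between-class variance of the
    histogram (background vs. foreground)."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * hist[i] for i in range(256))
    sum_bg = 0.0
    weight_bg = 0
    best_t, best_var = 0, -1.0
    for t in range(256):
        weight_bg += hist[t]
        if weight_bg == 0:
            continue
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        var_between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Toy grayscale data: dark "ink" pixels and light "parchment" pixels.
pixels = [40] * 30 + [50] * 20 + [190] * 60 + [210] * 90
t = otsu_threshold(pixels)
binary = [1 if p <= t else 0 for p in pixels]  # 1 = foreground (ink)
```

For degraded historical documents with background stains, a single global threshold like this often fails, which is one motivation for the more robust binarization methods studied in the thesis.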