In this thesis, we introduce an approach to web ranking extraction based on the analysis of visual features. Web rankings are ordered lists of web objects that are commonly found on websites. Because web rankings share a strong visual language, our system can identify these objects from their visual features. This allows us to train a classifier efficiently with a small set of examples and to extract data from websites whose source code differs. We also present an extension for Google Chrome for annotating examples of web rankings. The Wres Annotation Extension is publicly available under a permissive open-source license and relies on W3C standards. Using our annotation tool, we build a training set covering the most common web ranking layouts, namely table, list, simple list, grid, and tiling layouts. We also present a model to formally describe web rankings. We use semi-supervised machine learning techniques to build classifiers for web ranking identification, and we use Weka to train and run our classifiers throughout the thesis. The thesis describes both supervised and semi-supervised techniques. In our comparison of machine learning algorithms we include BayesNet, NaiveBayes, LibSVM, MultilayerPerceptron, SimpleLogistic, IBk, KStar, LWL, DecisionTable, JRip, OneR, PART, ZeroR, DecisionStump, HoeffdingTree, J48, LMT, RandomForest, RandomTree, and REPTree. We evaluate our classifiers and compare the results of the different machine learning techniques. Our evaluation shows that the system works best for labels that repeat multiple times and have a distinctive visual representation. With the Random Forest algorithm we achieve a precision of 0.9427 for repeating labels and 0.7573 for non-repeating labels. Additionally, we compare our system to two commercially available, proprietary web extraction systems: Diffbot and Import.io.
We analyze the performance of both systems and find that Diffbot is not an appropriate solution for extracting web rankings, since it can only extract data from 2% of the rankings in our dataset. Import.io can extract web rankings from 53% of the web pages in our dataset, performing better on some types of web rankings than on others. By specializing solely in web rankings, our system outperforms both Diffbot and Import.io, correctly extracting 71% of web rankings.