Empirical studies and experiments provide the foundation for research in academia and industry. Planning and executing controlled experiments typically requires considerable effort. This thesis introduces the concept of using public coding contests as a vehicle for scientific studies. Coding contests are typically large events in which skilled developers try to solve problems as fast as possible. Rules that require participants to choose one of a small, predefined selection of software development techniques to solve a problem make empirical analysis possible afterwards. The benefits are twofold. Researchers obtain data from a large set of subjects for their studies while saving the effort of hosting the experiment themselves. Contest organizers gain free advertisement and improve the reputation of their contest by giving it a scientific touch. To test our idea, four research issues are stated and discussed in the later sections. The first two discuss the vehicle concept in general from a scientific point of view. For the last two issues we conduct a proof of concept by embedding an experiment into a real coding contest with hundreds of participants and analyze the results, namely the effects of Pair-Programming (PP) and coding experience on time and defect variables. Pair-Programming is a software development technique that is currently under investigation in the field of computer science. However, studies of the efficiency of PP have produced varying results and need further replication. The methodological approach of this thesis was to start with a literature review on experimentation in software engineering and on Pair-Programming, followed by conducting an experiment. The results of the experiment are used to discuss the research issues. The experiment was performed successfully and delivered enough data to discuss the research issues.
Pairs needed the same time to finish their tasks as individual programmers, which means that the effort (person-hours) doubled. With regard to software quality (defects), pairs performed slightly better, but not significantly so. The target audiences are a) researchers in the field of empirical software engineering and b) organizations that use comprehensive recruiting methods or assessments to test their future employees.