Serious games are games that serve a purpose other than pure entertainment. The serious games industry and research community are growing rapidly. However, research on their effectiveness has long rested on theoretical assumptions about their potential rather than on empirical evidence. Studies attempting to evaluate serious games show inconclusive, inconsistent and even contradictory results. This is caused by a lack of standardized methods and guidelines for avoiding common mistakes in the evaluation of serious games, and by the unquestioned adoption of methodologies from other domains. This thesis attempts to identify issues in common methodologies used in the evaluation of serious games. This includes identifying mediating variables as well as some peculiarities specific to the evaluation of games. The use of randomized controlled trials, promoted as the method of choice in the field, is found to be unnecessary or even inappropriate in many cases. The common use of media comparison designs is discouraged, primarily because they assume that instruction in different media can be designed to teach identical content, and that the impact of the instructional medium can therefore be isolated and measured. Many evaluations of serious games rest on misconceptions about games. To develop a common understanding of the domain, a unified perspective is proposed to aid the design, application, evaluation and iteration of serious games. Its prerequisites, namely an understanding of game dynamics, of learning theories and of their possible applications in games, are covered in this work. The proposed perspective on the design and evaluation of serious games is illustrated with the games EnerCities, Spent, and Portal 2, all of which have been evaluated in instructional contexts in previous studies. In this work, the games are analyzed from a learning-theoretical standpoint to explain the evaluation results of those studies.