The web of data is growing at a staggering pace. A large number of data sources, APIs, services, and data visualizations are publicly available. Satisfying users' complex information needs by integrating and processing data from disparate sources, however, remains challenging. In recent years, a substantial body of research into mashup-based data integration has emerged. These mashups foster the combination and reuse of data and services and thereby have the potential to enable rapid creation of rich web applications. Nonetheless, users lacking technical expertise still face enormous barriers when trying to develop such mashups efficiently and effectively. To address this issue, we introduce an approach to compose mashups that integrate heterogeneous data sources in an automatic, collaborative, and distributed manner. We follow a visual programming paradigm and adhere to three guiding principles: openness, connectedness, and reusability. The approach is based on semantic web technologies and the concept of Linked Widgets, i.e., web widgets backed by a semantic model. Linked Widgets are designed to effectively tackle data integration challenges by (i) fostering reusability of data processing tasks, (ii) easing data integration via simple operations, (iii) allowing users to explore relevant data sources with regard to their context, (iv) tackling data heterogeneity, and (v) facilitating automatic data integration. This thesis introduces a new model of semantic, collaborative, and distributed mashups. Following semantic web principles for data integration, these ad-hoc mashup-based data integration applications can simultaneously process and combine heterogeneous data contributed by multiple stakeholders. The data can come from various devices such as sensors, embedded devices, mobile phones, desktops, or web servers.
The approach does not require data to be uploaded to a central server infrastructure, but rather allows each stakeholder to keep control over their data and expose only relevant subsets to the collaborative group. Distributed mashups can run persistently in the background and are hence well suited to real-time data monitoring and data streaming use cases.