Structural analysis
Measuring factors affecting local loyalty based on a correlation network
Understanding the level of local loyalty is crucial for urban planners, as individuals who exhibit higher levels of loyalty are more likely to adopt a “voice” strategy and act in the interest of their community, while being less likely to relocate. This study aims to develop a methodology for assessing and determining the factors influencing local loyalty levels. It is presumed that different factors contribute to each level of local loyalty. Through the identification of loyalty components and potential drivers, a data-driven approach based on correlation networks was employed to identify critical factors influencing loyalty at varying levels. The methodology was applied in Veszprém, Hungary, the European Capital of Culture in 2023, using a representative survey. The findings reveal that while demographic variables exhibit a weak correlation with loyalty levels, residents living in the city centre tend to show higher loyalty. Factors associated with high local loyalty include well-being, employment opportunities, healthy social relationships, and strong family ties. Conversely, the least loyal group is characterized by weak connections with friends, neighbours, and colleagues, as well as living in unsafe environments.
Post Date: February 2024
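As a rough illustration of the correlation-network idea in the abstract above, the following minimal sketch links survey variables whose Pearson correlation exceeds a threshold. The variable names, the data and the threshold of 0.6 are hypothetical and are not taken from the Veszprém survey; the study's actual procedure may differ.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_network(columns, threshold):
    """Return {(var_a, var_b): r} for variable pairs with |r| >= threshold."""
    edges = {}
    for a, b in combinations(sorted(columns), 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= threshold:
            edges[(a, b)] = r
    return edges


# Hypothetical survey answers: one list per question, one entry per respondent.
survey = {
    "loyalty": [1, 2, 3, 4, 5],
    "well_being": [2, 2, 3, 5, 5],
    "age": [30, 50, 20, 40, 60],
}
network = correlation_network(survey, threshold=0.6)
```

With these made-up numbers only the loyalty and well-being variables end up connected, which mirrors the kind of reading the abstract describes: strongly correlated factors become neighbours of the loyalty node.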
Multilayer Network-Based Evaluation of the Efficiency and Resilience of Network Flows
Supply chain optimization and resource allocation are challenging because of the complex dynamics of flows. We can classify these flows based on whether they perform value-added or nonvalue-added activities in our process. The aim of this article is to present a multilayered temporal network-based model for the analysis of network flows in supply chain optimization and resource allocation. Implementation of a multilayered network distinguishes value-added from nonvalue-added resource flows, enabling a comprehensive view of the flow of resources in the system. Incorporating weighted edges representing the probabilities of time-dependent flows identifies the resource needs and excesses at each supply site, addresses optimal transportation challenges for resource reallocation, and assesses the efficiency and robustness of the system by examining the overlaps in network layers. The proposed method offers a significant extension to the toolsets for network flow analysis, which has the potential to improve decision-making processes for organizations dealing with complex resource management problems. The applicability of the proposed method is demonstrated by analyzing the temporal network extracted from taxi cab flows in New York City. With the application of the method, the results indicate that significant reductions in idle times are achievable.
Post Date: 18 July 2024
Network science and explainable AI-based life cycle management of sustainability models
Model-based assessment of the potential impacts of variables on the Sustainable Development Goals (SDGs) can provide valuable additional information about possible policy intervention points. In the context of sustainability planning, machine learning techniques can provide data-driven solutions throughout the modeling life cycle. In a changing environment, existing models must be continuously reviewed and developed for effective decision support. Thus, we propose to use the Machine Learning Operations (MLOps) life cycle framework. A novel approach for model identification and development is introduced, which uses the Shapley value to determine the individual direct and indirect contributions of each variable to the output, and network analysis to identify key drivers and support the identification and validation of possible policy intervention points. The applicability of the methods is demonstrated through a case study of the Hungarian water model developed by the Global Green Growth Institute. Based on the model exploration of water efficiency and water stress (SDG indicators 6.4.1 and 6.4.2) in the examined period, water reuse and water circularity offer a more effective intervention option than pricing and the use of internal or external renewable water resources.
Post Date: 17 June 2024
Network-based visualisation of frequent sequences
Frequent sequence pattern mining is an excellent tool to discover patterns in event chains. In complex systems, events from parallel processes are present, often without proper labelling. To identify the groups of events related to the subprocess, frequent sequential pattern mining can be applied. Since most algorithms return so many frequent sequences that the results are difficult to interpret, it is necessary to post-process the resulting frequent patterns. The available visualisation techniques do not allow easy access to multiple properties that support a faster and better understanding of the event scenarios. To address this issue, our work proposes an intuitive and interactive solution to support this task, introducing three novel network-based sequence visualisation methods that can reduce the time of information processing from a cognitive perspective. The proposed visualisation methods offer a more information-rich and easily understandable interpretation of sequential pattern mining results compared to the usual text-like outcome of pattern mining algorithms. The first uses the confidence values of the transitions to create a weighted network, while the second enriches the adjacency matrix based on the confidence values with similarities of the transitive nodes. The enriched matrix enables a similarity-based Multidimensional Scaling (MDS) projection of the sequences. The third method uses similarity measurement based on the overlap of the occurrences of the supporting events of the sequences. The applicability of the method is presented in an industrial alarm management problem and in the analysis of clickstreams of a website. The method was fully implemented in a Python environment. The results show that the proposed methods are highly applicable for the interactive processing of frequent sequences, supporting the exploration of the inner mechanisms of complex systems.
Post Date: 13 May 2024
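The first visualisation method mentioned above builds a weighted network from the confidence values of transitions. A minimal sketch of that idea follows, assuming that the confidence of a transition a -> b is the number of times b directly follows a divided by the number of times a is followed by anything. The event labels and sequences are hypothetical, and the paper's own definition of confidence over mined frequent sequences may differ.

```python
from collections import Counter


def transition_confidences(sequences):
    """Map each transition (a, b) to count(a -> b) / count(a as a source)."""
    pair_counts = Counter()
    source_counts = Counter()
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            pair_counts[(a, b)] += 1
            source_counts[a] += 1
    return {
        (a, b): count / source_counts[a]
        for (a, b), count in pair_counts.items()
    }


# Hypothetical alarm sequences; the labels are illustrative only.
sequences = [
    ["A", "B", "C"],
    ["A", "B", "D"],
    ["A", "C"],
]
edges = transition_confidences(sequences)
```

Each key of `edges` can be read as a directed edge and each value as its weight, so the dictionary can be passed to any graph drawing tool.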
Hypergraph and network flow-based quality function deployment
Quality function deployment (QFD) has been a widely acknowledged tool for translating customer requirements into quality product characteristics, based on which product development strategies and focus areas are identified. However, although the QFD method considers the correlations and effects between development parameters, these are not directly incorporated into the importance ranking of development actions. Therefore, the cross-relationships between development parameters and their impact on customer requirement satisfaction are often neglected. The primary objective of this study is to make decision-making more reliable by improving QFD with methods that optimize the selection of development parameters even under capacity or cost constraints, directly incorporate cross-relationships between development parameters, and support the visual identification of interactions. To this end, QFD is approached from two directions that have proved efficient in operations research. 1) QFD is formulated as a network flow problem with two objectives: maximizing the benefits of satisfying customer needs using linear optimization, or minimizing the total cost of actions while still meeting customer requirements using an assignment formulated as a minimum cost flow problem. 2) QFD is represented as a hypergraph, which allows efficient representation of the interactions of the relationship and correlation matrices and the determination of essential factors based on centrality metrics. The applicability of the methods is demonstrated through an application study on the sustainable design of customer electronic products, which highlights the contribution of each improvement to different development strategies: linear optimization performed best in maximizing the satisfaction of customer requirements, the assignment as a minimum cost flow approach minimized the total cost, and the hypergraph-based representation identified the indirect interactions of development parameters and customer requirements.
Post Date: 14 December 2022
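For orientation, the conventional QFD importance ranking that the abstract above sets out to improve can be sketched as a weighted sum over the relationship matrix. This is only the classical baseline, not the network flow or hypergraph formulation of the paper, and the requirements, parameters and scores below are hypothetical.

```python
# Hypothetical customer-requirement weights and relationship scores.
requirement_weights = {"long battery life": 5, "low price": 3}
relationships = {
    "long battery life": {"battery capacity": 9, "recycled casing": 0},
    "low price": {"battery capacity": 3, "recycled casing": 1},
}


def technical_importance(weights, relationships):
    """Sum weight * relationship strength for every development parameter."""
    scores = {}
    for requirement, weight in weights.items():
        for parameter, strength in relationships[requirement].items():
            scores[parameter] = scores.get(parameter, 0) + weight * strength
    return scores


importance = technical_importance(requirement_weights, relationships)
ranking = sorted(importance, key=importance.get, reverse=True)
```

Note that this baseline ignores the correlations between development parameters, which is exactly the gap the abstract describes.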
Cooperation patterns in the ERASMUS student exchange network: an empirical study
The ERASMUS program is the most extensive cooperation network of European higher education institutions. The network involves 90% of European universities and hundreds of thousands of students. The allocated money and number of travelers in the program are growing yearly. By considering the interconnection of institutions, the study asks how the program’s budget performs, whether the program can achieve its expected goals, and how the program contributes to the development of a European identity, interactions among young people from different countries and learning among cultures. Our goal was to review and explore the elements of network structures that can be used to understand the complexity of the whole ERASMUS student mobility network at the institutional level. The results suggest some socioeconomic and individual behavioral factors underpinning the emergence of the network. While the nodes are spatially distributed, geographical distance does not play a role in the network’s structure, although parallel travelling strategies exist, i.e., preferences for short- and long-distance travel. The European regions of home and host countries also affect the network. One of the most considerable driving forces of edge formation between institutions is the set of subject areas represented by participating institutions. The study finds that faculties of institutions are connected rather than institutions, and a multilayer network model is suggested to explore the mechanisms of those connections. The results indicate that the information uncovered by the study is helpful to scholars and policymakers.
Post Date: 27 October 2022
Hypergraph-based analysis and design of intelligent collaborative manufacturing space
A method for hypergraph-based analysis and the design of manufacturing systems has been developed. The reason for its development is the need to integrate the human workforce into Industry 4.0 solutions. The proposed intelligent collaborative manufacturing space enhances collaboration between the operators as well as provides them with valuable information about their performance and the state of the production system. The design of these Operator 4.0 solutions requires a problem-specific description of manufacturing systems, the skills, and states of the operators, as well as of the sensors placed in the intelligent space for the simultaneous monitoring of the cooperative work. The design of this intelligent collaborative manufacturing space requires the systematic analysis of the critical sets of interacting elements. The proposal is that hypergraphs can efficiently represent these sets; moreover, studying the centrality and modularity of the resultant hypergraphs can support the formation of collaboration and interaction schemes and the formation of manufacturing cells. A fully reproducible illustrative example presents the applicability of this concept.
Post Date: 06 September 2022
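To make the hypergraph representation above more concrete, the sketch below stores each set of interacting elements as a hyperedge and uses the number of hyperedges per node as a simple degree centrality. The operator, sensor and workstation names are hypothetical, and degree is only one of several centrality measures that could be applied.

```python
from collections import defaultdict

# Hypothetical hyperedges: each set groups elements that interact in one activity.
hyperedges = {
    "assembly": {"operator_1", "operator_2", "workstation_A"},
    "inspection": {"operator_2", "camera", "workstation_A"},
    "logistics": {"operator_3", "workstation_A"},
}


def node_degrees(hyperedges):
    """Count, for every node, the number of hyperedges that contain it."""
    degrees = defaultdict(int)
    for members in hyperedges.values():
        for node in members:
            degrees[node] += 1
    return dict(degrees)


def shared_nodes(hyperedges, first, second):
    """Return the nodes that two hyperedges have in common."""
    return hyperedges[first] & hyperedges[second]


degrees = node_degrees(hyperedges)
most_central = max(degrees, key=degrees.get)
```

Unlike an ordinary graph edge, a hyperedge can join any number of elements, so one activity involving two operators and a workstation stays a single object instead of being split into pairwise links.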
Ontology-Based Analysis of Manufacturing Processes: Lessons Learned from the Case Study of Wire Harness Production
Effective information management is critical for the development of manufacturing processes. This paper aims to provide an overview of ontologies that can be utilized in building Industry 4.0 applications. The main contributions of the work are that it highlights ontologies that are suitable for manufacturing management and recommends the multilayer-network-based interpretation and analysis of ontology-based databases. This article not only serves as a reference for engineers and researchers on ontologies but also presents a reproducible industrial case study that describes the ontology-based model of a wire harness assembly manufacturing process.
Frequent Itemset Mining and Multi-Layer Network-Based Analysis of RDF Databases
Triplestores or resource description framework (RDF) stores are purpose-built databases used to organise, store and share data with context. Knowledge extraction from a large amount of interconnected data requires effective tools and methods to address the complexity and the underlying structure of semantic information. We propose a method that generates an interpretable multilayered network from an RDF database. The method utilises frequent itemset mining (FIM) of the subjects, predicates and the objects of the RDF data, and automatically extracts informative subsets of the database for the analysis. The results are used to form layers in an analysable multidimensional network. The methodology enables a consistent, transparent, multi-aspect-oriented knowledge extraction from the linked dataset. To demonstrate the usability and effectiveness of the methodology, we analyse how the science of sustainability and climate change are structured using the Microsoft Academic Knowledge Graph. In the case study, the FIM forms networks of disciplines to reveal the significant interdisciplinary science communities in sustainability and climate change. The constructed multilayer network then enables an analysis of the significant disciplines and interdisciplinary scientific areas. To demonstrate the proposed knowledge extraction process, we search for interdisciplinary science communities and then measure and rank their multidisciplinary effects. The analysis identifies discipline similarities, pinpointing the similarity between atmospheric science and meteorology as well as between geomorphology and oceanography. The results confirm that frequent itemset mining provides informative sampled subsets of RDF databases which can be simultaneously analysed as layers of a multilayer network.
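A heavily simplified sketch of the idea described above: subjects are treated as transactions, the objects attached to them by one predicate are the items, and object pairs that co-occur often enough become edges of a network layer. The triples, the predicate name `hasField` and the support threshold are hypothetical, and only itemsets of size two are counted here, whereas a full FIM algorithm handles arbitrary sizes.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical triples (subject, predicate, object); not taken from a real graph.
triples = [
    ("paper1", "hasField", "meteorology"),
    ("paper1", "hasField", "atmospheric science"),
    ("paper2", "hasField", "meteorology"),
    ("paper2", "hasField", "atmospheric science"),
    ("paper3", "hasField", "oceanography"),
    ("paper3", "hasField", "meteorology"),
]


def frequent_pairs(triples, predicate, min_support):
    """Return object pairs that co-occur on at least min_support subjects."""
    objects_by_subject = defaultdict(set)
    for subject, pred, obj in triples:
        if pred == predicate:
            objects_by_subject[subject].add(obj)
    counts = Counter()
    for objects in objects_by_subject.values():
        counts.update(combinations(sorted(objects), 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}


layer = frequent_pairs(triples, "hasField", min_support=2)
```

Running the same function for different predicates would give one edge set per predicate, which is one plausible way to obtain the layers of a multilayer network.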
Analytic Hierarchy Process and Multilayer Network-Based Method for Assembly Line Balancing
Assembly line balancing improves the efficiency of production systems by the optimal assignment of tasks to operators. The optimisation of this assignment requires models that provide information about the activity times, constraints and costs of the assignments. A multilayer network-based representation of the assembly line-balancing problem is proposed, in which the layers of the network represent the skills of the operators, the tools required for their activities and the precedence constraints of their activities. The activity–operator network layer is designed by a multi-objective optimisation algorithm in which the training and equipment costs as well as the precedence of the activities are also taken into account. As these costs are difficult to evaluate, the analytic hierarchy process (AHP) technique is used to quantify the importance of the criteria. The optimisation problem is solved by a multi-level simulated annealing algorithm (SA) that efficiently handles the precedence constraints. The efficiency of the method is demonstrated by a case study from wire harness manufacturing.
Multilayer network based comparative document analysis (MUNCoDA)
The proposed multilayer network-based comparative document analysis (MUNCoDA) method supports the identification of the common points of a set of documents, which deal with the same subject area. As documents are transformed into networks of informative word-pairs, the collection of documents form a multilayer network that allows the comparative evaluation of the texts. The multilayer network can be visualized and analyzed to highlight how the texts are structured. The topics of the documents can be clustered based on the developed similarity measures. By exploring the network centralities, topic importance values can be assigned. The method is fully automated by KNIME preprocessing tools and MATLAB/Octave code.
• Networks can be formed based on informative word pairs of multiple documents
• The analysis of the proposed multilayer networks provides information for multi-document summarization
• Words and documents can be clustered based on node similarity and edge overlap measures
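The points above can be illustrated with a small sketch in which every document becomes one layer of word-pair edges and two layers are compared by their edge overlap. Plain adjacency of words stands in for the "informative word-pairs" of the method, the Jaccard index is an assumed overlap measure, and the example texts are hypothetical. The published implementation uses KNIME and MATLAB/Octave, so this Python version is only an illustration.

```python
def word_pair_edges(text, window=2):
    """Build undirected edges between words at most window - 1 positions apart."""
    words = text.lower().split()
    edges = set()
    for i, word in enumerate(words):
        for other in words[i + 1 : i + window]:
            if other != word:
                edges.add(frozenset((word, other)))
    return edges


def edge_overlap(layer_a, layer_b):
    """Jaccard index of two edge sets: shared edges divided by all edges."""
    union = layer_a | layer_b
    return len(layer_a & layer_b) / len(union) if union else 0.0


# Hypothetical mini-documents; each becomes one layer of the multilayer network.
documents = {
    "doc1": "clean water supply",
    "doc2": "clean water access",
    "doc3": "renewable energy access",
}
layers = {name: word_pair_edges(text) for name, text in documents.items()}
```

Here `edge_overlap(layers["doc1"], layers["doc2"])` is one third because the two texts share the pair "clean water", while the first and third documents share no edge at all.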
Focal points for sustainable development strategies—Text mining-based comparative analysis of voluntary national reviews
Countries have to work out and follow tailored strategies for the achievement of their Sustainable Development Goals. At the end of 2018, more than 100 voluntary national reviews (VNRs) were published. The reviews are transformed by text mining algorithms into networks of keywords to identify country-specific thematic areas of the strategies and cluster countries that face similar problems and follow similar development strategies. The analysis of the 75 VNRs has shown that SDG5 (gender equality) is the most discussed goal worldwide, as it is discussed in 77% of the analysed Voluntary National Reviews. SDG8 (decent work and economic growth) is the second most studied goal, with 76%, while SDG1 (no poverty) is the least discussed goal, mentioned in only 48% of the documents, followed by SDG10 (reduced inequalities) with 49%. The results demonstrate that the proposed benchmark tool is capable of highlighting what kind of activities can make significant contributions to achieving sustainable development.
A multilayer and spatial description of the Erasmus mobility network
The Erasmus Programme is the biggest collaboration network consisting of European Higher Education Institutions (HEIs). The flows of students, teachers and staff form directed and weighted networks that connect institutions, regions and countries. Here, we present a linked and manually verified dataset of this multiplex, multipartite, multi-labelled, spatial network. We enriched the network with institutional socio-economic data from the European Tertiary Education Register (ETER) and the Global Research Identifier Database (GRID). We geocoded the headquarters of institutions and characterised the attractiveness and quality of their environments based on Points of Interest (POI) data. The linked datasets provide relevant information to grasp a more comprehensive understanding of the mobility patterns and attractiveness of the institutions.
Review and structural analysis of system dynamics models in sustainability science
As the complexity of sustainability-related problems increases, it is more and more difficult to understand the related models. Although a tremendous number of models have been published recently, tools for their automated structural analysis are still missing. This study provides a methodology to structure and visualise the information content of these models. The novelty of the present approach is the development of a network analysis-based tool that allows modellers to measure the importance of variables, identify structural modules in the models and measure the complexity of the created model, thus enabling the comparison of different models. An overview of 130 system dynamics models from the past five years is provided. The typical topics and complexity of these models highlight the need for tools that support the automated structural analysis of sustainability problems. For practising engineers and analysts, nine models from the field of sustainability science, including the World3 model, are studied in detail. The results show that, with the help of the developed method, experts can highlight the most critical variables of sustainability problems (like arable land in the World3 model) and can determine how these variables are clustered and interconnected (e.g. population and fertility are key drivers of global processes). The developed software tools and the resulting networks are all available online.
Data-driven multilayer complex networks of sustainable development goals
This data article presents the formulation of a multilayer network for modelling the interconnections among the sustainable development goals (SDGs) and their targets, and includes the correlation-based linking of the sustainable development indicators with the available long-term datasets of The World Bank, 2018. The spatial distribution of the time series data allows creating country-specific sustainability assessments. In the related research article “Network Model-Based Analysis of the Goals, Targets and Indicators of Sustainable Development for Strategic Environmental Assessment” the similarities of SDGs for ten regions have been modelled in order to improve the quality of strategic environmental assessments. The datasets of the multilayer networks are available on Mendeley.
Network Model-Based Analysis of the Goals, Targets and Indicators of Sustainable Development for Strategic Environmental Assessment
Strategic environmental assessment is a decision support technique that evaluates policies, plans and programs in addition to identifying the most appropriate interventions in different scenarios. This work develops a network-based model to study interlinked ecological, economic, environmental and social problems to highlight the synergies between policies, plans, and programs in environmental strategic planning. Our primary goal is to propose a methodology for the data-driven verification and extension of expert knowledge concerning the interconnectedness of the sustainable development goals and their related targets. A multilayer network model based on the time-series indicators of the World Bank open data over the last 55 years was assembled. The results illustrate that, by providing an objective and data-driven view of the correlated variables of the World Bank, the proposed layered multipartite network model highlights previously undiscussed interconnections, node centrality measures evaluate the importance of the targets, and network community detection algorithms reveal their strongly connected groups. The results confirm that the proposed methodology can serve as a data-driven decision support tool for the preparation and monitoring of long-term environmental policies. The newly developed data-driven network model enables multi-level analysis of sustainability (goals, targets, indicators) and supports long-term environmental strategic planning. Through relationships among indicators, relationships among targets and goals can be modelled. The results show that the sustainable development goals are strongly interconnected, while the 5th goal (gender equality) is linked mostly to the 17th goal (partnerships for the goals). The analysis has also highlighted the importance of the 4th goal (quality education).
Frequent pattern mining in multidimensional organizational networks
Network analysis can be applied to understand organizations based on patterns of communication, knowledge flows, trust, and the proximity of employees. A multidimensional organizational network was designed, and association rule mining of the edge labels was applied to reveal how relationships, motivations, and perceptions determine each other in different scopes of activities and types of organizations. Frequent itemset-based similarity analysis of the nodes provides the opportunity to characterize typical roles in organizations and clusters of co-workers. A survey was designed to define 15 layers of the organizational network and demonstrate the applicability of the method in three companies. The novelty of our approach resides in the evaluation of people in organizations as frequent multidimensional patterns of multilayer networks. The results illustrate that the overlapping edges of the proposed multilayer network can be used to highlight the motivation and managerial capabilities of the leaders and to find similarly perceived key persons.
The Settlement Structure Is Reflected in Personal Investments: Distance-Dependent Network Modularity-Based Measurement of Regional Attractiveness
How are ownership relationships distributed in the geographical space? Is physical proximity a significant factor in investment decisions? What is the impact of the capital city? How can the structure of investment patterns characterize the attractiveness and development of economic regions? To explore these issues, we analyze the network of company ownership in Hungary and determine how connections are distributed in geographical space. Based on the calculation of the internal and external linking probabilities, we propose several measures to evaluate the attractiveness of towns and geographic regions. Community detection based on several null models indicates that modules of the network coincide with administrative regions, in which Budapest is the absolute centre, and where county centres function as hubs. Gravity model-based modularity analysis highlights that, besides the strong attraction of Budapest, geographical distance has a significant influence over the frequency of connections and the target nodes play the most significant role in link formation, which confirms that the analysis of the directed company-ownership network gives a good indication of regional attractiveness.
Evaluating the Interconnectedness of the Sustainable Development Goals Based on the Causality Analysis of Sustainability Indicators
Policymaking requires an in-depth understanding of the cause-and-effect relationships between the sustainable development goals. However, due to the complex nature of socio-economic and environmental systems, this is still a challenging task. In the present article, the interconnectedness of the United Nations (UN) sustainability goals is measured using the Granger causality analysis of their indicators. The applicability of the causality analysis is validated through the predictions of the World3 model. The causal relationships are represented as a network of sustainability indicators providing the opportunity for the application of network analysis techniques. Based on the analysis of 801 UN indicator types in 283 geographical regions, approximately 4000 causal relationships were identified and the most important global connections were represented in a causal loop network. The results highlight the drastic deficiency of the analysed datasets, the strong interconnectedness of the sustainability targets and the applicability of the extracted causal loop network. The analysis of the causal loop networks emphasised the problems of poverty, proper sanitation and economic support in sustainable development.
Automated Analysis of the Interactions Between Sustainable Development Goals Extracted from Models and Texts of Sustainability Science
The design and monitoring of sustainable policies should rely on models that can handle complex and interconnected variables and subsystems of sustainability issues. Structuring knowledge has been identified as an essential first step in building models of sustainability science. Although it is known that all models yield a reduced view of the examined topic and no models can include all the variables that would make the representation closed and comprehensive, in the case of sustainability issues it is critical to synthesize as many critical aspects as possible that could have an impact on the studied problem. The key idea of our research is that strategic plans, sustainability reports and scientific studies reflect these variables; therefore, with the tools of text mining, the most important focus points and interactions can be determined. These key aspects and their connections can be represented by a network structure and compared to the subsystems of the dynamic models of sustainability to explore the deficiencies of the models or the lack of focus of the related policies and documentation. In the present work, the proposed methodology is demonstrated through the analysis of five strategic documents, and the identified aspects are compared with the structure of the famous World3 system dynamics model. The comparison highlighted the incomplete view of the original World3 model, since certain topics were not critical issues while the World3 model was in development.
Graph configuration model based evaluation of the education-occupation match
To study education-occupation matchings, we developed a bipartite network model of the education-to-work transition and a graph configuration model-based metric. We studied the career paths of 15 thousand Hungarian students based on the integrated database of the National Tax Administration, the National Health Insurance Fund, and the higher education information system of the Hungarian Government. A brief analysis of the gender pay gap and the spatial distribution of over-education is presented to demonstrate the background of the research and the resulting open dataset. We highlighted the hierarchical and clustered structure of the career paths based on the multi-resolution analysis of the graph modularity. The results of the cluster analysis can help policymakers fine-tune the fragmented program structure of higher education.
All the files and the R code are available at: https://github.com/abonyilab/Edu_Mine_Graph
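As a hedged illustration of a configuration model-based match metric, the sketch below compares each observed education-occupation edge weight with the weight expected when only the node degrees are preserved, k_i * k_j / m. This ratio is an assumed, commonly used form and not necessarily the metric defined in the paper; the data are hypothetical, and the authors' own code in the repository above is written in R.

```python
from collections import Counter

# Hypothetical (degree programme, occupation) pairs, one per graduate.
transitions = [
    ("engineering", "engineer"),
    ("engineering", "engineer"),
    ("engineering", "teacher"),
    ("education", "teacher"),
]


def match_ratios(transitions):
    """Observed / expected edge weight under a bipartite configuration model.

    The expected weight of edge (i, j) is k_i * k_j / m, where k_i and k_j are
    the weighted degrees of the two nodes and m is the number of transitions.
    """
    observed = Counter(transitions)
    programme_degree = Counter(programme for programme, _ in transitions)
    occupation_degree = Counter(occupation for _, occupation in transitions)
    m = len(transitions)
    return {
        (programme, occupation): count
        / (programme_degree[programme] * occupation_degree[occupation] / m)
        for (programme, occupation), count in observed.items()
    }


ratios = match_ratios(transitions)
```

A ratio above one marks a transition that occurs more often than the degrees alone would suggest, which can be read as a stronger than random education-occupation match.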
Multilayer Network-Based Production Flow Analysis
A multilayer network model for the exploratory analysis of production technologies is proposed. To represent the relationship between products, parts, machines, resources, operators, and skills, standardized production and product-relevant data are transformed into a set of bi- and multipartite networks. This representation is beneficial in production flow analysis (PFA) that is used to identify improvement opportunities by grouping similar groups of products, components, and machines. It is demonstrated that the goal-oriented mapping and modularity-based clustering of multilayer networks can serve as a readily applicable and interpretable decision support tool for PFA, and the analysis of the degrees and correlations of a node can identify critically important skills and resources. The applicability of the proposed methodology is demonstrated by a well-documented benchmark problem of a wire-harness production process. The results confirm that the proposed multilayer network can support the standardized integration of production-relevant data and exploratory analysis of strongly interconnected production systems.
Scalable co-Clustering using a Crossing Minimization ‒ Application to Production Flow Analysis
Production flow analysis includes various families of components and groups of machines. Machine-part cell formation means the optimal design of manufacturing cells consisting of similar machines producing similar products from a similar set of components. Most algorithms reorder the machine-part incidence matrix. We generalize this classical concept to handle more than two elements of the production process (e.g. machine - part - product - resource - operator). The application of this extended concept requires an efficient optimization algorithm for the simultaneous grouping of these elements. For this purpose, we propose a novel co-clustering technique based on crossing minimization of layered bipartite graphs. The present method has been implemented as a MATLAB toolbox. The efficiency of the proposed approach and developed tools is demonstrated by realistic case studies. The log-linear scalability of the algorithm is proven theoretically and experimentally.
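To show how crossing minimization can reveal machine-part cells, the sketch below applies the well-known barycenter heuristic to a two-layer graph. This heuristic is a stand-in chosen for illustration and is not claimed to be the algorithm of the paper, whose toolbox is written in MATLAB. The machine and part names are hypothetical.

```python
def count_crossings(edges, left_order, right_order):
    """Count the edge pairs that cross in a two-layer drawing."""
    left_pos = {node: i for i, node in enumerate(left_order)}
    right_pos = {node: i for i, node in enumerate(right_order)}
    crossings = 0
    for i, (u1, v1) in enumerate(edges):
        for u2, v2 in edges[i + 1 :]:
            if (left_pos[u1] - left_pos[u2]) * (right_pos[v1] - right_pos[v2]) < 0:
                crossings += 1
    return crossings


def barycenter_order(edges, fixed_order, free_side):
    """Sort the free layer by the mean position of each node's neighbours."""
    pos = {node: i for i, node in enumerate(fixed_order)}
    neighbours = {}
    for u, v in edges:
        free, fixed = (u, v) if free_side == "left" else (v, u)
        neighbours.setdefault(free, []).append(pos[fixed])
    return sorted(neighbours, key=lambda n: sum(neighbours[n]) / len(neighbours[n]))


# Hypothetical machine-part incidences forming two interleaved cells.
edges = [
    ("M1", "P1"), ("M1", "P3"), ("M2", "P2"), ("M2", "P4"),
    ("M3", "P1"), ("M3", "P3"), ("M4", "P2"), ("M4", "P4"),
]
machines = ["M1", "M2", "M3", "M4"]
parts = ["P1", "P2", "P3", "P4"]

before = count_crossings(edges, machines, parts)
parts = barycenter_order(edges, machines, free_side="right")
machines = barycenter_order(edges, parts, free_side="left")
after = count_crossings(edges, machines, parts)
```

After one sweep the machines are ordered M1, M3, M2, M4 and the parts P1, P3, P2, P4, so the two cells sit next to each other and the number of crossings drops from 8 to 2.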
Node Similarity Based Graph Clustering and Visualization
The basis of the presented methods for the visualization and clustering of graphs is a novel similarity and distance metric, and the matrix describing the similarity of the nodes in the graph. This matrix represents the type of connections between the nodes in the graph in a compact form, thus it provides a very good starting point for both the clustering and visualization algorithms. Hence visualization is done with the MDS (Multidimensional Scaling) dimensionality reduction technique obtaining the spectral decomposition of this matrix, while the partitioning is based on the results of this step generating a hierarchical representation. A detailed example is shown to justify the capability of the described algorithms for clustering and visualization of the link structure of Web sites.