Graph configuration model based evaluation of the education-occupation match

posted Mar 6, 2018, 11:47 AM by János Abonyi

To study education-occupation matching, we developed a bipartite network model of the education-to-work transition and a metric based on the graph configuration model. We studied the career paths of 15 thousand Hungarian students based on the integrated database of the National Tax Administration, the National Health Insurance Fund, and the higher education information system of the Hungarian Government. A brief analysis of the gender pay gap and the spatial distribution of over-education is presented to demonstrate the background of the research and the resulting open dataset. We highlighted the hierarchical and clustered structure of the career paths based on the multi-resolution analysis of the graph modularity. The results of the cluster analysis can help policymakers fine-tune the fragmented program structure of higher education.
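As a rough illustration of what a configuration-model-based match metric can look like, the sketch below compares the observed number of graduates on each education-occupation edge with the number expected if edges were wired at random while keeping the node degrees fixed (k_e * k_o / m). The function name, the toy data and the choice of an observed/expected ratio are assumptions made for illustration; the published metric may be defined differently.

```python
from collections import Counter

def configuration_model_ratio(pairs):
    """Observed / expected edge weight for each (education, occupation) pair.

    pairs: one (education, occupation) tuple per graduate (hypothetical input).
    Expected weight under the configuration model: k_edu * k_occ / m.
    """
    m = len(pairs)
    edu_degree = Counter(edu for edu, _ in pairs)
    occ_degree = Counter(occ for _, occ in pairs)
    observed = Counter(pairs)
    return {
        (edu, occ): count / (edu_degree[edu] * occ_degree[occ] / m)
        for (edu, occ), count in observed.items()
    }

# Toy data, invented for this example only.
toy_pairs = (
    [("law", "lawyer")] * 3 + [("law", "clerk")]
    + [("it", "developer")] * 3 + [("it", "clerk")]
)
ratios = configuration_model_ratio(toy_pairs)
```

A ratio above 1 means the transition occurs more often than random wiring would suggest; a ratio near 1 means the pairing is no more frequent than chance.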

The details of this research are published in PLOS ONE:

All the files and the R code are available at:

Sequence Mining based Alarm Suppression

posted Feb 10, 2018, 12:45 PM by János Abonyi   [ updated Feb 10, 2018, 12:52 PM ]

To provide more insight into the process dynamics and represent the temporal relationships among faults, control actions and process variables, we propose a multi-temporal sequence mining based algorithm. The methodology starts with the generation of frequent temporal patterns of the alarm signals. We transformed the multi-temporal sequences into Bayes classifiers. The obtained association rules can be used to define alarm suppression rules. We analyzed the dataset of a laboratory-scale water treatment testbed to illustrate that multi-temporal sequences are applicable for the description of operation patterns. We extended the benchmark simulator of a vinyl acetate production technology to generate easily reproducible results and stimulate the development of alarm management algorithms. The results of detailed sensitivity analyses confirm the benefits of applying temporal alarm suppression rules that reflect the dynamical behaviour of the process.
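To make the idea of a temporal alarm rule more concrete, here is a minimal sketch that estimates the conditional probability that alarm B follows alarm A within a time window. This is only a simple confidence measure on a hypothetical alarm log; the function name, the log format and the window parameter are assumptions, and this is not the published sequence mining algorithm.

```python
def rule_confidence(events, antecedent, consequent, window):
    """Estimate P(consequent occurs within `window` | antecedent occurred).

    events: list of (timestamp, alarm_tag) tuples sorted by timestamp
    (hypothetical log format).
    """
    hits = 0
    total = 0
    for i, (t, tag) in enumerate(events):
        if tag != antecedent:
            continue
        total += 1
        # Scan forward until the time window is exceeded.
        for t_next, tag_next in events[i + 1:]:
            if t_next - t > window:
                break
            if tag_next == consequent:
                hits += 1
                break
    return hits / total if total else 0.0

# Invented alarm log: B follows A quickly in two of the three cases.
alarm_log = [(0, "A"), (2, "B"), (10, "A"), (30, "B"), (40, "A"), (43, "B")]
confidence = rule_confidence(alarm_log, "A", "B", window=5)
```

A rule with high confidence is a candidate for suppression: if B almost always follows A within the window, B carries little new information for the operator.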

These files are the supplementary materials of our paper, which will be published in IEEE Access, 2018. For the extended simulator of the vinyl acetate production technology and the source code of the Bayes’ theorem-based evaluation of sequences, see:

The MATLAB implementation of the sequence mining algorithm is available at:

Visualization and interpretation of deep learning models

posted Feb 10, 2018, 12:36 PM by János Abonyi   [ updated Feb 10, 2018, 12:49 PM ]

We visualise LSTM deep learning models by principal component analysis. The similarity of the events in fault isolation can be evaluated based on the linear embedding layer of the network, which maps the input signals into a continuous-valued vector space. The method is demonstrated on a simulated vinyl acetate production technology. The results illustrate that RNN-based sequence learning not only yields accurate fault classification, but the visualisation of the model can also give useful hints for hazard analysis.

The related paper will be published in the Journal of Chemometrics soon.

The algorithm was implemented in Python. The related code can be downloaded from our GitHub repository.

3rd International Conference on Internet of Things, Big Data and Security - Janos Abonyi became a member of the PC

posted Jun 7, 2017, 10:09 PM by János Abonyi   [ updated Jun 7, 2017, 10:10 PM ]

The internet of things (IoT) is a platform that allows a network of devices (sensors, smart meters, etc.) to communicate, analyse data and process information collaboratively in the service of individuals or organisations. The IoT network can generate large amounts of data in a variety of formats and using different protocols which can be stored and processed in the cloud. The conference looks to address the issues surrounding IoT devices, their interconnectedness and services they may offer, including efficient, effective and secure analysis of the data IoT produces using machine learning and other advanced techniques, models and tools, and issues of security, privacy and trust that will emerge as IoT technologies mature and become part of our everyday lives.

1. Big Data Research
2. Emerging Services and Analytics
3. Internet of Things (IoT) Fundamentals
4. Internet of Things (IoT) Applications
5. Big Data for Multi-discipline Services
6. Security, Privacy and Trust
7. IoT Technologies

Regular Paper Submission: October 16, 2017 
Regular Paper Authors Notification: December 15, 2017 
Regular Paper Camera Ready and Registration: January 4, 2018 

International Conference on Communication, Computing & Internet of Things - (IC3IoT 2018)

posted May 31, 2017, 8:53 PM by János Abonyi

"Acceptance of global competitiveness and unlimited innovations are emerging as the most critical elements in wealth generation in the current world economy. Transitions into a developed nation and empowered society shall not be a far away dream but shall be a near future reality. The need for linking science and technology to the growth of India shall be intensified and improved by conferences of this kind.

Broadband and Wireless Communication have brought massive changes to the world and continue to provide an array of new challenges, multi-domain applications and solutions such as IoT. The aim of IC3IoT is to provide an excellent forum for sharing knowledge and present the innovative researchers, and technologies as well as developments and future demands related to Broadband Technologies, Computing Technologies, Human-Computer Interaction and Wireless Communication along with IoT.

An International conference of this nature will enhance and benefit the human society at large since it will bring together leading researchers engineers and scientists in the domain of interest."

Prof. Abonyi is a member of the program committee of the conference. More details can be found on the website of the event.

!!!!! HAS - UP "Momentum" Complex Systems Research Group !!!!

posted May 19, 2017, 10:11 AM by János Abonyi   [ updated May 19, 2017, 10:13 AM ]

The objective of the Lendület (Momentum) Program of the Hungarian Academy of Sciences is a dynamic renewal of the research teams of the Academy and participating universities. With the help of this program, we transform and extend the group of Prof. Abonyi into a research group devoted to complex systems.

We will form a new school for rethinking and upgrading systems engineering and data science in the light of the fourth industrial revolution. The overall goal of the project is the development of new algorithms and open source tools to utilise the data collected by internetworking systems in monitoring, control, optimisation, scheduling, risk management, and product lifecycle management. This goal challenges present-day internet of things technology regarding the development of software agent and advanced sensor fusion functionalities.

We believe that algorithms tailored for (1) multivariate time series analysis, (2) software sensors and event analysis, (3) localisation and (4) model mining can result in significant progress in this field. The creative and integrated application of the resulting algorithms can bring a new perspective to the integrated monitoring and structural analysis of complex systems and the utilisation of open and linked data. The full integration of these four subprojects is of primary importance and ensures the strength and uniqueness of this proposal.

The proposed centre, therefore, aims to bring together the best technological expertise in systems-, data-, and network science, and industrial intelligence. As part of its mission, the Group will make the new and integrated solutions available to the research community and industry through its collaborations and training.


Network science and control theory - Our paper in Scientific Reports!

posted Mar 11, 2017, 6:43 AM by János Abonyi

Network theory based controllability and observability analyses have become widely used techniques. We realized that most applications are not related to dynamical systems, and mainly the physical topologies of the systems are analysed without deeper consideration. Here, we draw attention to the importance of dynamics inside and between state variables by adding edges defined by functional relationships to the original topology. The resulting networks differ from the physical topologies of the systems and describe the dynamics of the conservation of mass, momentum and energy more accurately. We define the typical connection types and highlight how the reinterpreted topologies change the number of necessary sensors and actuators in benchmark networks widely studied in the literature. Additionally, we offer a workflow for network science-based dynamical system analysis, and we also introduce a method for generating the minimum number of necessary actuator and sensor points in the system.
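The effect of added edges on the number of required actuators can be sketched with the maximum-matching approach commonly used in the structural controllability literature: the number of driver nodes is the number of nodes left unmatched by a maximum matching of the directed graph (at least one). The code below is a minimal sketch under that assumption, with a hypothetical function name and toy graphs; it is not the method introduced in the paper.

```python
def min_driver_nodes(n, edges):
    """Number of driver nodes of a directed graph with nodes 0..n-1.

    Uses a simple augmenting-path maximum matching in which each node may
    have at most one matched outgoing and one matched incoming edge.
    """
    successors = [[] for _ in range(n)]
    for u, v in edges:
        successors[u].append(v)
    matched_by = [None] * n  # matched_by[v] == u if edge u -> v is matched

    def augment(u, seen):
        for v in successors[u]:
            if v in seen:
                continue
            seen.add(v)
            if matched_by[v] is None or augment(matched_by[v], seen):
                matched_by[v] = u
                return True
        return False

    matching_size = sum(augment(u, set()) for u in range(n))
    return max(n - matching_size, 1)

# Toy star topology: node 0 feeds nodes 1, 2 and 3.
star = [(0, 1), (0, 2), (0, 3)]
# Same topology with self-loops on the leaves, standing in for
# dynamics inside the state variables (an illustrative assumption).
star_with_dynamics = star + [(1, 1), (2, 2), (3, 3)]

drivers_physical = min_driver_nodes(4, star)
drivers_functional = min_driver_nodes(4, star_with_dynamics)
```

In this toy case the purely physical star needs three driver nodes, while the version with internal dynamics needs only one, which shows how strongly the result depends on which edges are included in the model.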

PhD defense of Laszlo Dobos, 7th of June, 2016, 1pm

posted May 20, 2016, 3:04 PM by János Abonyi   [ updated May 20, 2016, 3:05 PM ]

Development of Experimental Design Techniques for Analyzing and Optimization of Operating Technologies

The aim of this thesis is to introduce the theoretical basics of different approaches that can further support production process development based on knowledge extracted from process data. As the selection of a time frame with a certain operation is the starting point of further process investigation, a Dynamic Principal Component Analysis (DPCA) based time-series segmentation approach is introduced first. This new solution results from integrating DPCA tools into classical univariate time-series segmentation methodologies. It helps to detect changes in the linear relationships of process variables, which can be caused by faults or malfunctions. This step can be the first one in model-based process development, since it makes it possible to neglect the operating ranges that can ruin the prediction capability of the model. From another point of view, we can highlight problematic operating regimes and focus on finding their root causes. When fault-free, linear operation segments have been selected, further segregation of the data segments is needed to find data slices with high information content in terms of model parameter identification. As the tools of Optimal Experiment Design (OED) are appropriate for measuring the information content of process data, the goal-oriented integration of OED tools and classical time-series segmentation can handle the problem. The Fisher information matrix is one of the basic tools of OED. It is built from the partial derivatives of the model output with respect to the model parameters for a particular input data sequence. A new, Fisher information matrix based time-series segmentation methodology has been developed to evaluate the information content of an input data slice. Using this tool, it becomes possible to select the potentially most valuable and informative time-series segments. This leads to a reduction in the number of industrial experiments and their costs.
At the end of the thesis, a novel, economic objective function-oriented framework is introduced for tuning model predictive controllers so that they exploit the full control potential while respecting the physical and chemical limits of the process.
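The role of the Fisher information matrix in ranking data segments, mentioned in the abstract above, can be illustrated with a deliberately simple case. The sketch assumes a hypothetical static model y = a*u + b with two parameters, whose output sensitivities are dy/da = u and dy/db = 1, and uses the determinant of the matrix (the D-optimality criterion) as the information measure. The model, the function names and the data are assumptions for illustration, not the thesis implementation.

```python
def fisher_information(inputs, noise_var=1.0):
    """Fisher information matrix of y = a*u + b for one input segment.

    Each sample contributes s * s^T / noise_var with s = [u, 1].
    """
    sum_uu = sum(u * u for u in inputs)
    sum_u = sum(inputs)
    n = len(inputs)
    return [
        [sum_uu / noise_var, sum_u / noise_var],
        [sum_u / noise_var, n / noise_var],
    ]

def d_criterion(fim):
    """Determinant of a 2x2 Fisher information matrix."""
    return fim[0][0] * fim[1][1] - fim[0][1] * fim[1][0]

# Invented segments: a steady-state slice and a slice with input changes.
steady_segment = [2.0, 2.0, 2.0, 2.0]
excited_segment = [0.0, 1.0, 2.0, 3.0]

steady_score = d_criterion(fisher_information(steady_segment))
excited_score = d_criterion(fisher_information(excited_segment))
```

The steady-state slice scores zero because a constant input cannot separate the two parameters, whereas the slice with varying input scores higher and would be preferred for parameter identification.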

Media Cloud

posted May 20, 2016, 2:58 PM by János Abonyi

Media Cloud is a project of the Harvard Berkman Center for Internet & Society and the MIT Center for Civic Media. It is worth taking a look.
