Databases
Fast data analysis with in-memory technology
Fast data analysis pays off in production, for example when a company has to react immediately to a production fault or a cyber attack. An in-memory database, which can work up to 100 times faster than conventional systems, is well suited to this.
Big data and data analysis are far more than the much-cited hype. Big data is a reality: the average volume of data produced in companies doubles every 18 to 24 months. Business intelligence (BI) is playing an increasingly important role here. After all, those who collect big data but do not use it for their own business success will not be able to keep up with the competition in the long term. This is particularly true for industry.
Increasing product diversity, digitalization and the Internet of Things, as well as rising customer requirements, are driving up the volume of data generated in industrial production enormously - a major challenge for companies. Evaluating this data in a targeted manner and recognizing and presenting the correlations in it is now essential for a company's success. Industry is increasingly recognizing this potential and, with comprehensive analysis options, is paving the way to the smart factory. Data analyses give manufacturing companies important insights: they form a sound basis for strategic decisions and help to optimize production processes significantly.
Fast data analysis
In order to benefit from data analysis, company, product or customer data must, for certain applications, be analyzed in the sub-second range. So-called in-memory databases are the ideal solution for such complex and fast analyses.
Real-time data analyses carried out with in-memory solutions raise production processes to a new level of efficiency. In production, for example, the real-time evaluation of sensor data (machine, logistics and product data) ensures that machine failures can be registered and rectified immediately. At the same time, machine learning models filter recurring patterns from the data so that machine downtimes can be avoided in a targeted way, for example through predictive maintenance, which in turn saves significant costs. Companies can also reduce the reject rate in parts production. IT security benefits from fast data analysis as well: here in particular, analysis within minutes or even seconds is necessary in order to identify potential risks in good time and react to them - for example in the case of denial-of-service (DoS) attacks on web services.
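As a rough illustration of how incoming sensor readings can be checked as they arrive, the following Python sketch flags values that deviate strongly from a short rolling window. It is a simple statistical threshold, not one of the machine learning models mentioned above, and the function name, window size, threshold and temperature values are all assumptions made for this example.

```python
from collections import deque
from statistics import mean, pstdev


def find_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings that deviate strongly from the recent window."""
    recent = deque(maxlen=window)
    anomalies = []
    for index, value in enumerate(readings):
        if len(recent) == window:
            mu = mean(recent)
            sigma = pstdev(recent)
            # Flag the reading when it lies more than `threshold` standard
            # deviations away from the mean of the preceding window.
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                anomalies.append(index)
        recent.append(value)
    return anomalies


# Hypothetical spindle temperatures in degrees Celsius.
temperatures = [70.1, 70.3, 69.9, 70.2, 70.0, 70.1, 85.4, 70.2]
print(find_anomalies(temperatures))  # [6]
```

In a real plant, a check of this kind would run continuously against the database instead of a fixed list, and the threshold would have to be tuned per machine.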
What is an in-memory database?
Exasol is an analytical in-memory database that can also be deployed as a 'high-performance layer' on top of Hadoop, among other systems. The solution is available as standalone software, as an appliance or as a cloud solution.
But why is an in-memory database the perfect basis for real-time analysis in industry? In short: because it is very fast. An analytical in-memory database of this kind is a column-oriented high-performance database that stores and processes very large amounts of data in main memory. With column-oriented storage, large amounts of data can be compressed and written 100 times faster than in conventional row-oriented databases. Accessing data in main memory is up to 1,000 times faster than accessing data on hard disk. In most cases, analytical in-memory databases are therefore 50 to 100 times faster overall and enable well-founded, real-time reactions to large amounts of data, something that was previously unthinkable.
In practice, analyses that once took several hours can be reduced to just a few minutes or even seconds. Conventional databases are optimized for processing data on hard disks, which are much slower than main memory, and therefore cannot deliver real-time analyses. In contrast, in-memory technology enables real-time data analyses and evaluations as well as ad-hoc reports in a matter of seconds.
In addition, in-memory technology significantly simplifies data analysis, as it is no longer necessary to aggregate the base data in advance. The in-memory database can directly analyze both structured and unstructured data from various upstream systems.
In-memory databases also offer industry, which is increasingly data-driven, the necessary degree of flexibility: the Exasol solution, for example, is easy to implement and highly scalable, as it does not require complex installation and configuration. Instead, it is characterized by automated self-optimization. On the one hand, this is essential because the mountains of data are constantly growing. On the other hand, it significantly reduces the burden on the company's own IT resources.
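To make the idea of column orientation more concrete, the following Python sketch transposes a handful of row records into columns and compresses two of them with run-length encoding. Run-length encoding is only one common columnar compression technique and is used here as an assumption; the article does not say which methods Exasol applies. The sample records and function names are hypothetical.

```python
from itertools import groupby

# Hypothetical measurement records, stored row by row.
rows = [
    ("M1", "ok", 4.98),
    ("M1", "ok", 5.01),
    ("M1", "ok", 5.00),
    ("M2", "ok", 5.02),
    ("M2", "reject", 5.40),
]


def to_columns(records):
    """Transpose row tuples into one list per column."""
    return [list(column) for column in zip(*records)]


def run_length_encode(column):
    """Collapse runs of equal neighbouring values into (value, count) pairs."""
    return [(value, len(list(run))) for value, run in groupby(column)]


machine, status, diameter = to_columns(rows)
print(run_length_encode(machine))  # [('M1', 3), ('M2', 2)]
print(run_length_encode(status))   # [('ok', 4), ('reject', 1)]

# An aggregate over one attribute only has to read that single column.
print(sum(diameter) / len(diameter))
```

Because the values within a column are of the same type and often repeat, they tend to compress well, and a query that needs one attribute does not have to read the others.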
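The following sketch shows what an ad hoc query directly on base data, without a pre-aggregated table, can look like. It uses the in-memory mode of SQLite from the Python standard library purely as a stand-in; SQLite is not a column-oriented analytical database, and the table, column names and sample rows are assumptions made for this example.

```python
import sqlite3

# SQLite's in-memory mode stands in for an analytical in-memory database here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (machine TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO parts VALUES (?, ?)",
    [("M1", "ok"), ("M1", "ok"), ("M1", "reject"), ("M2", "ok"), ("M2", "ok")],
)


def reject_rate_by_machine(connection):
    """Compute the reject rate per machine straight from the raw rows."""
    query = """
        SELECT machine, AVG(status = 'reject') AS reject_rate
        FROM parts
        GROUP BY machine
        ORDER BY machine
    """
    return connection.execute(query).fetchall()


for machine, rate in reject_rate_by_machine(conn):
    print(machine, round(rate, 3))
```

The point of the sketch is that the aggregate is computed at query time, so a new question only needs a new query and no new summary table.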
Necessary investments
There is no general answer to the question of what investment an in-memory solution requires, as the total budget depends on the individual case and the existing infrastructure. However, it is often not a question of completely replacing existing systems and carrying out cost-intensive migration projects. In-memory databases are often implemented as an additional high-performance layer, as an alternative to expensive upgrades to existing database systems. The technical requirements for efficient in-memory data analysis are straightforward:
- powerful and efficient systems,
- a freely scalable, analytical database that can process large volumes of data and
- a business intelligence front end or visualization software to control the analyses.
When deciding on an in-memory database solution, the most important task within the IT department is to take stock of the existing IT infrastructure. In principle, big data analyses do not require a complete replacement of systems. In most cases, it is sufficient to supplement existing systems with new ones. The high-performance database can be flexibly embedded in the existing infrastructure, for example by scaling on standard hardware or by integrating with Hadoop and other specialized databases. Nothing in the existing infrastructure needs to be changed.
The main criterion when analyzing the existing infrastructure is whether it can handle the data volumes generated as required. The performance and speed of the implemented systems play the biggest role here. As a rule, traditional IT systems are overloaded by huge amounts of data and react too slowly. It is also important to be able to merge data from various heterogeneous sources and to process and analyze large volumes of data in real time. Flexibility is an equally essential criterion. Furthermore, the solution should be low-maintenance and should not require constant tuning. In-memory systems reduce the workload enormously here, as there is no need for tedious system analysis and continuous tuning of the database configuration. While one or two people are normally responsible for a conventional database and have to monitor and optimize the system constantly, this effort is usually eliminated completely with in-memory solutions.
In addition, in-house solutions can be supplemented with cloud solutions: even when budgets are tight, analysis projects can be implemented successfully and cost-effectively with the help of flexibly scalable software-as-a-service (SaaS) or cloud solutions. Companies without huge IT budgets can thus also use powerful solutions, as only a small amount of expertise needs to be built up in-house, and initial investments and responsibility for hardware and software are eliminated. Project duration and complexity can also be reduced as a result. In addition, flexible terms and scaling options in the SaaS sector offer more agility and less risk.
Therefore, the main task of an IT manager is to bring the existing systems up to speed and choose solutions that can grow with the new tasks in order to ensure long-term investment security.
Author:
Mathias Golombek is Chief Technology Officer (CTO) at Exasol.
Efficient data analysis at Semikron
Semikron also benefits from the advantages of in-memory data analysis in the production environment. The manufacturer of power electronics uses an in-memory database management solution for measurement data archiving. As a result, the company cut the time needed to create a box plot showing the value distribution of all orders from the previous month from one week to one minute per month. Key production figures for the products sold can also be generated at any time at the touch of a button. Preventive analyses and planning scenarios based on measurement data can be created particularly quickly and dynamically using in-memory technology - even ad hoc if necessary. It is also possible to call up all production records with all available data for an individual component at any time.
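For readers unfamiliar with box plots, the following Python sketch computes the five values such a plot is drawn from: minimum, lower quartile, median, upper quartile and maximum. The function name and the sample values are hypothetical, and the "inclusive" quartile method is only one of several common conventions.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Return the five numbers a box plot is drawn from."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }


# Hypothetical measured values for one month of orders.
measurements = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(box_plot_summary(measurements))
# {'min': 1, 'q1': 3.0, 'median': 5.0, 'q3': 7.0, 'max': 9}
```

The calculation itself is cheap; with a whole month of measurement data, the time goes into reading and sorting the values, which is where a fast analytical database helps.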
When the company manufactures power semiconductor modules and systems in Slovakia, for example, special input plug-ins first ensure that the incoming heterogeneous measurement and process data is standardized and stored on an international server. In a second step, this standardized data is transferred from the international server to a central server in Germany. The data is then unpacked and loaded into the in-memory database. There it is available for analytical queries and reports at any time.
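The standardization step can be pictured as a small registry of input plug-ins, each of which converts one source format into a common record layout. The following Python sketch is purely illustrative: the two formats, the field names and all function names are assumptions and do not describe Semikron's actual plug-ins.

```python
import csv
import io
import json


def parse_csv_line(raw):
    """Hypothetical plug-in for a tester that emits 'serial;parameter;value'."""
    serial, parameter, value = next(csv.reader(io.StringIO(raw), delimiter=";"))
    return {"serial": serial, "parameter": parameter, "value": float(value)}


def parse_json_record(raw):
    """Hypothetical plug-in for a machine that emits JSON with its own key names."""
    record = json.loads(raw)
    return {
        "serial": record["sn"],
        "parameter": record["param"],
        "value": float(record["val"]),
    }


# Registry mapping a source format to the plug-in that understands it.
PLUGINS = {"csv": parse_csv_line, "json": parse_json_record}


def standardize(source_format, raw):
    """Convert one raw record into the common schema."""
    return PLUGINS[source_format](raw)


incoming = [
    ("csv", "A100;reverse_voltage;1200.5"),
    ("json", '{"sn": "A101", "param": "reverse_voltage", "val": "1198"}'),
]
standardized = [standardize(fmt, raw) for fmt, raw in incoming]
print(standardized)
```

Once every source delivers the same fields, the records can be loaded into a single table and queried together, regardless of which machine produced them.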
The system consists of 3+1 individual computers (nodes), which are connected to form a cluster. The data is distributed automatically across the cluster so that all hardware resources are used optimally during calculations, and all information is stored in columns rather than rows. The database system also automatically determines the ideal system configuration.
This automation saves the administrator from having to analyze the queries and manually create the required indices. The system has also led to optimized processes and workflows: "For example, we have achieved significant quality improvements in module production by introducing a plasma cleaner before processing the raw material. We were also able to quickly determine from the outset which raw material was used for a particular production batch and in which customer product it was further processed and sold. Since then, it has been possible to track certain critical process parameters (e.g. reverse voltage) from the chip to the finished module or system and on to the end customer," explains Gerhard Zapf, Senior Manager IT Application at Semikron.
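The automatic distribution of data across the cluster nodes described in this case study can be pictured as hash partitioning: each record is assigned to a node based on a stable hash of a key. The following Python sketch rests on several assumptions: that three nodes hold data (reading "3+1" as three data nodes plus one further node is a guess), that a component serial number serves as the key, and that CRC32 is the hash. The article does not describe how the database actually places data.

```python
import zlib
from collections import defaultdict

ACTIVE_NODES = 3  # assumption: three nodes hold data


def node_for(key, node_count=ACTIVE_NODES):
    """Map a key to a node index using a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % node_count


def distribute(serials, node_count=ACTIVE_NODES):
    """Group serial numbers by the node they are assigned to."""
    placement = defaultdict(list)
    for serial in serials:
        placement[node_for(serial, node_count)].append(serial)
    return dict(placement)


# Hypothetical component serial numbers.
serials = [f"A{number}" for number in range(100, 112)]
for node, assigned in sorted(distribute(serials).items()):
    print(node, assigned)
```

Because the same key always maps to the same node, each node can work on its share of the data in parallel, and the partial results only need to be combined at the end.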












