Introduction
Alfa Bank is a medium-sized bank with branches in Russia, Ukraine, Belarus, the Netherlands, and Kazakhstan. It is a universal financial institution that provides a full range of financial services to individuals and to small and medium-sized businesses. The bank already has a developed hardware and software infrastructure. Both workstations and servers run Windows. The bank uses Colvir software as its automated banking system (ABS), Compass as its card system, customer relationship management (CRM) software, and various remote banking solutions for more efficient customer service.
Problem description
Over its years of operation the bank has accumulated a huge amount of data (hundreds of terabytes). Database performance has degraded and queries take too long. To solve this problem, the bank decided to implement a data warehouse and migrate the data of previous years to it. A data warehouse also allows data from multiple systems to be integrated and used for analytical reporting, structured queries, and decision making.
The banking sector changes rapidly, and under these conditions banks face many new and serious problems. To be successful, banks need to adapt to fast and unpredictable change. Many banks struggle because of heterogeneous systems, excessive functionality and resources, or an inadequate service level.
Evaluation
Bank operations generate a lot of data every day, so the solution needs to be scalable. The bank's existing database uses PL/SQL. The data warehouse solution must be reliable and have high-quality support. The following major vendors offer their own data warehouse solutions: Oracle, Teradata, IBM, Microsoft, and Pentaho.
A simple data warehousing architecture looks as follows:
A data warehouse is a subject-oriented, integrated, time-variant, and non-volatile collection of data. With this data, companies can make informed decisions. A data warehouse presents a consolidated, multidimensional view of the data. Data mining and OLAP are also usually built into data warehouse software. A data warehouse is usually kept separate from typical OLTP databases for the following reasons:
An OLTP database is built for well-known, simple tasks such as searching for and indexing individual records. In contrast, data warehouse queries are complex.
An OLTP query may modify data, while an OLAP query only reads it.
An OLTP database stores current data. A data warehouse is needed to maintain historical data.
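The contrast between the two workloads can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module purely for demonstration; the transactions table, its columns, and the sample rows are hypothetical and do not describe the bank's actual schema or any vendor product.

```python
import sqlite3

# Hypothetical schema and sample data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    "id INTEGER PRIMARY KEY, account_id INTEGER, branch TEXT, "
    "year INTEGER, amount REAL)"
)
rows = [
    (1, 101, "North", 2015, 250.0),
    (2, 101, "North", 2016, 100.0),
    (3, 202, "South", 2015, 400.0),
    (4, 303, "South", 2016, 150.0),
    (5, 404, "East", 2016, 300.0),
]
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?, ?)", rows)

# OLTP-style work: a short, keyed operation that changes a single current row.
conn.execute("UPDATE transactions SET amount = amount + 50 WHERE id = ?", (2,))
current = conn.execute(
    "SELECT amount FROM transactions WHERE id = ?", (2,)
).fetchone()[0]

# OLAP-style work: a read-only aggregation over the whole history,
# grouped along two dimensions (branch and year).
totals = conn.execute(
    "SELECT branch, year, SUM(amount) FROM transactions "
    "GROUP BY branch, year ORDER BY branch, year"
).fetchall()

print(current)
for branch, year, total in totals:
    print(branch, year, total)
```

The first query touches one row through its key and modifies it, while the second scans every row and only reads. Running many queries of the second kind against a live transactional database is the kind of load a separate warehouse is meant to absorb.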
Oracle Database includes an embedded OLAP server. Oracle OLAP can provide full mathematical, statistical, and financial real-time analysis of historical data. Oracle OLAP is fully integrated into the database, which means that standard SQL querying, administrative, and reporting tools can be used. The stated benefits of using Oracle OLAP are ease of application development and administration, security, high performance, and scalability.
Oracle also has data mining features. Data mining uses large quantities of data to build models. These models can then be used to extract information from a data warehouse, predict which customers are likely to switch service providers, and identify fraudulent behavior. Oracle Data Mining supports classification, clustering, regression, attribute importance, association, and feature extraction functions [1].
Teradata Warehouse Miner can run against RDBMS databases through ODBC connections. It supports many data mining algorithms, such as K-means, association rule mining, regression, and decision tree analysis [2].
The comparison table of Oracle and Teradata solutions [3]:
IBM warehouse solutions include IBM DB2, InfoSphere, Cognos Data Manager, Cognos Framework Manager, DataStage, and Netezza. IBM Cognos Business Intelligence offers a full range of analytics and reporting functions: reports, dashboards, and scorecards [4].
While Oracle builds on a traditional OLTP model, the IBM PureData System is designed as a platform specifically for data warehousing and analytics. It employs the AMPP (asymmetric massively parallel processing) architecture. Because it is purpose-built for data warehousing, the IBM PureData System uses memory and CPU more efficiently.
The comparison table of Oracle and IBM warehousing solutions [5]:
Oracle Exadata is the more widely deployed solution, which means that more of its bugs have already been discovered and fixed, making it the more mature product. However, queries in IBM PureData are faster and the solution itself is cheaper.
The Microsoft Analytics Platform System meets the demands of a data warehouse environment with an integrated system that supports hybrid data warehouse scenarios. It makes it possible to query both relational and non-relational data by using Microsoft PolyBase and leading Big Data technologies [6].
Azure SQL Data Warehouse is a scalable cloud database for processing large volumes of both relational and non-relational data. It is built on an MPP (massively parallel processing) architecture. SQL Data Warehouse combines a typical relational database with Azure cloud capabilities, leverages the Azure platform, and complements the SQL Server ecosystem.
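The general idea behind MPP can be shown with a minimal sketch: rows are spread across several nodes, each node aggregates its own share, and a coordinator merges the partial results. The function names, the modulo-based distribution, and the sample rows below are assumptions made for illustration only; they do not reproduce the internals of Azure SQL Data Warehouse or any other product, and the per-node step runs sequentially here rather than in parallel.

```python
from collections import defaultdict


def distribute(rows, node_count):
    """Assign each (account_id, amount) row to a node by its distribution key."""
    nodes = [[] for _ in range(node_count)]
    for account_id, amount in rows:
        nodes[account_id % node_count].append((account_id, amount))
    return nodes


def local_aggregate(node_rows):
    """Sum amounts per account using only the rows stored on one node."""
    partial = defaultdict(float)
    for account_id, amount in node_rows:
        partial[account_id] += amount
    return dict(partial)


def merge(partials):
    """Coordinator step: combine the partial sums returned by all nodes."""
    result = {}
    for partial in partials:
        for account_id, subtotal in partial.items():
            result[account_id] = result.get(account_id, 0.0) + subtotal
    return result


# Hypothetical sample data: (account_id, amount) pairs.
rows = [(1, 10.0), (2, 20.0), (3, 30.0), (1, 5.0), (4, 40.0), (2, 2.5)]
nodes = distribute(rows, node_count=3)
# On a real MPP system each node would compute its partial result in parallel.
partials = [local_aggregate(node_rows) for node_rows in nodes]
totals = merge(partials)
print(totals)
```

Because all rows of one account land on the same node, each node can finish its part of the aggregation without exchanging data with the others, which is why adding nodes lets such a system scale with data volume.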
In general, Oracle and Microsoft both provide reliable, extensible solutions with many options. Oracle offerings are more attractive to existing Oracle customers, because they are integrated with other Oracle applications. The advantage is that everything works together, but the drawback is that Oracle products are expensive and proprietary. Microsoft offers a smaller product set, but it is better suited to small and medium-sized businesses. Microsoft provides high-quality analytics technologies, such as HDInsight for big data and Azure SQL Data Warehouse [7].
The Pentaho platform delivers analytics-ready data from all kinds of sources to the end user. Pentaho offers user-friendly drag-and-drop data integration for flat files, RDBMS, Hadoop, and other sources. The software provides a graphical ETL (extract-transform-load) designer that makes it easy to create data pipelines, a rich library of components for working with data from many sources, powerful orchestration capabilities, agile views for visualization and modeling, and an integrated enterprise scheduler.
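The extract-transform-load steps that such a designer assembles graphically can be sketched in a few lines of code. This is only an illustration of the ETL pattern using Python's standard library; it is not Pentaho's API, and the CSV content, the column names, and the customers table are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file export; in practice this would be read from disk.
SOURCE_CSV = """customer_id,name,balance
1, alice ,100.50
2,BOB,not_available
3,Carol,75
"""


def extract(text):
    """Extract: parse the flat file into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(records):
    """Transform: normalize names and drop rows with a non-numeric balance."""
    clean = []
    for rec in records:
        try:
            balance = float(rec["balance"])
        except ValueError:
            continue  # skip rows that cannot be converted
        name = rec["name"].strip().title()
        clean.append((int(rec["customer_id"]), name, balance))
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers ("
        "customer_id INTEGER PRIMARY KEY, name TEXT, balance REAL)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))
loaded = conn.execute(
    "SELECT customer_id, name, balance FROM customers ORDER BY customer_id"
).fetchall()
print(loaded)
```

Keeping the three stages as separate functions mirrors how a graphical ETL tool chains independent steps, so that a source or a cleaning rule can be replaced without touching the rest of the pipeline.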
Pentaho Reporting incorporates a Reporting Engine, a Report Designer, and a Business Intelligence (BI) Server. It comes with the following components: Report Designer, Metadata Editor, Design Studio, the Pentaho User Console web interface, and an ad hoc reporting interface.
The comparison between Oracle and Pentaho [8]:
The Pentaho solution is suitable for this case; however, support is not available in the languages of the countries where the bank is located. Implementing this solution would therefore be difficult because of possible misunderstandings.
Conclusion
In conclusion, the best option for this case is IBM's solution. It is cheaper than Oracle's, and its approach and technology are more innovative. Teradata runs only on Linux, which is inconvenient because all workstations and servers in the bank are Windows-based. Pentaho is generally applicable; however, its support is not available in the languages of the countries where the bank is located. Microsoft's offering is a cloud solution, which is not appropriate for a bank, where security is the main concern. The IBM software is extensible and can be tuned to different conditions. IBM PureData is therefore the appropriate choice in the long term.
References
[1] L. Paul, “Oracle Database Data Warehousing Guide” July, 2013.
[2] Teradata Website, “Teradata Warehouse Miner” July, 2016. Retrieved from: http://www.teradata.com/products-and-services/teradata-warehouse-miner/
[3] DB-engines Website, “System Properties Comparison Oracle Vs. Teradata” July, 2016. Retrieved from: http://db-engines.com/en/system/Oracle%3BTeradata
[4] Attain Insight Website, “Business Analytics” July, 2016. Retrieved from: http://www.attaininsight.com/solutions/business-analytics/advanced-analytics
[5] P. Francisco, “Oracle Exadata and IBM PureData System for Analytics Compared” April, 2013. Retrieved from: https://tdwi.org/
[6] Microsoft Website, “Microsoft Analytics Platform System” July, 2016. Retrieved from: https://www.microsoft.com/en-us/cloud-platform/analytics-platform-system
[7] “Oracle BI vs Microsoft Power BI” July, 2016. Retrieved from: http://www.butleranalytics.com/oracle-bi-vs-microsoft-power-bi/
[8] Finances Online Website, “Compare Pentaho and Oracle BI” July, 2016. Retrieved from: https://comparisons.financesonline.com/pentaho-vs-oracle-bi