Mainframe COBOL/CICS/VSAM to Java Application Server & Relational DB
TL;DR — Watch the video below to see how Heirloom automatically takes a complex mainframe warehousing application to Java in 60 seconds with 100% accuracy, guaranteed.
Migrating mainframe workloads to anywhere is hard, right?
You may have seen vendor presentations that promise an assured migration process, led by analysis tools that paint interrelationships between application artifacts that bedazzle (mislead) you into believing that the complexity is well understood. I get it. It looks good; impressive even.
It’s also blatant vendor misdirection.
What you are seeing is superficial at best. A “shiny object” that distracts you from the complexity ahead, and one that steers you towards an expensive multi-year services-led engagement that is aligned with the vendor’s business model, not yours.
Just ask the vendor “where does the application get deployed?”. If the answer is not “any Java Application Server”, you are being quietly led into a dependency on a labyrinthine proprietary black-box that underpins an enforced application software architecture (e.g. MVC). Any assertions that you are now on an agile, open, scalable, performant platform die right there.
I’m not going to get into an extensive takedown (in this article) of why migration transformation toolsets that are born of application analyzers are a hugely expensive strategic misstep, because I’d like you to spend the next 60 seconds watching how astoundingly fast Heirloom is at transforming mainframe applications to Java.
Need a recap? What you saw was a mainframe COBOL/CICS implementation of the TPC-C benchmark (an application with over 50,000 LOC and 7 BMS screens) being compiled by Heirloom (without any code changes) and deployed to a Java Application Server for immediate execution via a browser. All the data for this application was previously migrated from VSAM (EBCDIC encoded) to an RDBMS (ASCII encoded). In later articles, we’ll demonstrate how Heirloom migrates mainframe batch, data and security profiles just as quickly.
Although the resulting application is 100% Java (and deployable on-premise or to any cloud), Heirloom provides full support (via Eclipse plug-ins) for on-going development of the application in the host language (COBOL in this example, but PL/I also) or in the target language (i.e. Java), or both. This was done because any transformation is not just about the application artifacts. People are obviously a big part of the IP equation, and securing the engagement of IT staff is essential to ensure a successful transformation. Not just on day 1, but for many years post-deployment.
Would you be skeptical of a claim from a vendor that stated you can take existing mainframe workloads (online and batch), and automatically transform them (with 100% accuracy) into instantly agile Java applications that can immediately be deployed to the cloud? You wouldn’t be alone. For many of our initial client meetings, there’s a palpable sense of disbelief (or, healthy skepticism if you prefer).
So, here’s another 3-minute video from Ian White, Heirloom Computing’s VP of Engineering, that demonstrates that claim, using Pivotal Cloud Foundry (PCF).
What happened? We took an online mainframe application and deployed it to PCF in 3 minutes. No misdirection, real code, real results, and a re-platforming project lifecycle that puts you in control (so you can avoid black-box solutions).
For us at Heirloom Computing, Cloud Foundry is a great example of how Heirloom maximizes the power of open source stacks to provide clients with a way to include high-value mainframe workloads in strategic initiatives (e.g. cloud, digital transformation, etc.). One that protects existing function, but also one that is seamlessly integrated with an agile ecosystem.
There’s a lot of chatter about how to make mainframe workloads agile. I have contributed to that chatter myself. The discourse is essential. Boiled down, my assertion is that the mainframe ecosystem is foundationally not agile (and never will be). No amount of DevOps tooling, nor vendor misdirection is going to change that.
Mainframe workloads are an essential part of any digital transformation strategy, but those workloads will persist in a different form. One that protects existing function, but also one that is seamlessly integrated with an agile ecosystem.
Below is a (3 minute) video that implements the above statement. It was put together inside 2 hours by Heirloom Computing’s VP of Engineering, Ian White.
This was a mainframe application that was compiled (unchanged) to Java and executed on the cloud using Heirloom, which automatically makes the workload instantly agile (all transactions are immediately accessible as a service). Agile enough for Ian to very quickly hook it up to Alexa.
TL;DR — see picture above, or… a career COBOL’er makes a compelling argument that legacy application systems (COBOL et al) on the IBM Mainframe are killing IT digital transformation initiatives.
So, a heads-up… this article is going to be self-serving (at least to start with, perhaps longer), as I’ve come to the conclusion that it is necessary for me to “introduce” myself in an attempt to establish a greater level of credibility than I might otherwise be able to muster!
I’ve been working for over 30 years. My entire career has been in the “COBOL space”, the vast majority of it working with Global 2000 companies to deliver COBOL application development & deployment platforms that were primarily focused on adding value to the IBM Mainframe (“The World’s Greatest Legacy Ecosystem”).
I have worked at the “coal face” developing bespoke commercial COBOL applications. I have worked developing COBOL compilers and runtimes. I have led global teams of astoundingly brilliant people that have built COBOL ecosystems from scratch. Back in 2010, I and a group of others with similar career profiles (and significantly greater areas of expertise) founded Heirloom Computing to bring a new COBOL ecosystem to market.
Heirloom is an ecosystem that leverages open-source software stacks (primarily Java); one that immediately exposes existing business rules from mainframe workloads as a collection of Java interfaces and RESTful services so they are immediately available to other applications; one that from day 1 is absolutely guaranteed to accurately retain existing business logic, data integrity, and security profiles; one that allows application developers (using Eclipse) to continue in COBOL, or Java, or both, so IT can “iterate away” from a constrained model to an agile one, at a pace that is determined by their own unique business drivers. This approach removes the “re-platforming” risk and makes the workload instantly agile.
We did this because we believe (and our investors and customers have validated) that IT needs to get beyond decades-old legacy systems if they are going to compete in a digital world.
Credibility enhanced? Either way, on we go…
The IBM Mainframe is without a doubt (and by far) “The World’s Greatest Legacy Ecosystem”. Its reliability, its pervasiveness, and its role as keeper of systems of record are unmatched. Today, however, that proud legacy is increasingly burdensome. These (crucial) systems are severely & systemically constrained (and today, agility really matters), and they have paralyzed IT with a (fearful, non-viable) “do nothing” strategy, which consequently inhibits execution of the strategic initiatives (like digital transformation) that are needed to compete. And up to this point, we’ve not even mentioned the operational expense or the risks of an ever aging/depleting skills pool.
Some of these systems, especially in government, have eroded/warped to the point that paper processes have been introduced to integrate legacy workloads with new services! This is NOT a failure of DevOps, nor tooling, but a failure of leadership and the brutal reality that mainframe systems of record are inherently NOT agile because a) they were never designed that way, and b) the COBOL ecosystem itself (an archaic compute-model, a procedural language, a failure to embrace open source, a lack of application frameworks, an entrenched culture, …) is NOT agile.
In article, after article, after article, IT leaders and analysts have clearly identified the challenge. Progressive enterprises like GE and Capital One are already working on solutions. Mainframe workloads are an essential part of any digital transformation strategy, but those workloads will persist in a different form. One that protects existing function, but also one that is seamlessly integrated with an agile ecosystem.
Not 3 words you’d immediately assemble together, but that’s exactly what Senior ComputerWorld Editor, Patrick Thibodeau, did yesterday.
His article was prompted by a White House announcement of an “Office of American Innovation” to oversee the modernization of federal IT.
The article then goes on to give Compuware a platform to launch a somewhat bizarre defense of COBOL, as if somehow, wrapping COBOL applications up in DevOps methodologies makes them agile, and consequently, the mainframe can be seen as (according to Chris O’Malley, Compuware’s President/CEO) “… a working environment that looks exactly like Amazon (Web Services)”.
No. It’s not. There’s no amount of makeup that you can apply to my face to make me look like Brad Pitt. Fundamentally, all the required structures for that transformation just do not exist.
There’s much to applaud with Compuware’s mission to modernize and retool the application development lifecycle on the mainframe and impart valuable new skill sets to a workforce that has been largely isolated from considering different approaches to the art of application development. However, beyond that DevOps veneer, you are still working with COBOL. If that’s where you want to be, go for it.
As Shawn McCarthy, an analyst at IDC said later in the article: “… the challenge with older COBOL systems is that many were not designed to be extensible and everything that needs to be done has to rely on custom code”.
And that’s essentially why no matter how much makeup you apply, COBOL systems on the mainframe will never be truly agile. Instead, for as long as they persist, they will continue to be an increasingly burdensome anchor that will slowly but surely impinge on an enterprise’s ability to compete.
COBOL defined business software development for decades. Now, is it over the hill or just hitting its prime?
Elastic COBOL is part of Heirloom Platform-as-a-Service (PaaS), an application development toolset that is a plug-in to the Eclipse IDE framework. Elastic COBOL allows mainframe applications (including CICS and JCL) to execute as Java applications. You can continue to develop applications in COBOL or in Java, or both, enabling the transformation to Java to occur at a pace that is optimal for your business.
You can download Elastic COBOL for free. It is available on Windows, Linux, Mac OS X, Raspberry Pi and the cloud. That’s right — Raspberry Pi. So you can get out there and build an enterprise accounting system on a platform that lives in an Altoids tin.
As with so many of these compilers, Java (rather than machine code) is the target. People will argue about whether that’s a good thing or not, but the fact is that it makes the compiler much simpler to write and maintain. So get out your soldering iron, dust off your COBOL, and get your Altoids tin running.
Heirloom uses patented compiler technology to automatically transform mainframe applications into highly extensible Java source-code, with 100% accuracy, while guaranteeing the preservation of existing business logic.
I’ve been involved with COBOL for most of my professional career. It is a language that has many unique characteristics, not all positive. Loved by few and (unfairly) vilified by many, it has persisted because it is extremely good at what it was built for — encapsulating business rules.
Many of you who have experience with the COBOL eco-system will appreciate the quiet reality of the absolute dependence that we all have on it as we proceed through our working day. The rest of you will likely be somewhat perplexed that anyone even uses COBOL today, and no doubt bemused by the bold assertion that your daily life without COBOL would result in unadulterated chaos. Well, despite the many predictions over recent decades of COBOL’s demise, this reality is not going to change anytime soon. That said, it would be remiss of us to not acknowledge the strategic intent of enterprise IT to convert COBOL to Java.
For typically risk-averse enterprise IT organizations, moving beyond COBOL is a tricky proposition. These applications represent the competitive differentiation of the business. They are the operational and transactional backbones of the business. They are the definitive manifestation of “mission critical”. The thought of rewriting or replacing the high-value trusted business processes embedded in these systems can induce violent shudders of apprehension.
For server-side transaction processing, Java is often (if not already) the strategic platform of choice of enterprise IT – and even in the cloud, many PaaS providers have adopted Java as a supported engine (e.g. Amazon’s Elastic Beanstalk, Oracle Cloud, Google App Engine, and yes, even Microsoft’s Azure). Our take on why targeting the Java platform makes so much sense for enterprise IT, comes down to 4 key benefits:
1. The ability to deploy and extend applications on an open/strategic platform that is proven and trusted for high-transaction workloads that demand performance, scalability, reliability, security and manageability.
2. Consolidation of application infrastructure to a single platform. No need to deal with multiple platforms on multiple operating systems.
3. Strategically positions applications for the cloud. Many enterprises have already made a strategic commitment to the Java platform. It’s a smart move — Java has already established itself as the de facto execution engine for the cloud.
4. Improves the productivity and agility of the development organization by modernizing skills, methodology and process.
Elastic Transaction Platform (ETP), Elastic Batch Platform (EBP) and Elastic COBOL runtime environments are designed to run enterprise applications migrated from a mainframe environment, and to scale and be highly available in public or private cloud environments. The system is architected to scale out across a private or public infrastructure-as-a-service (IaaS). As more power is required, additional virtual machines are brought online or physical resources (nodes) are allocated on the framework and assigned additional transaction (ETP) or batch (EBP) workload. The scale-out model is in contrast to most other systems in this class (with the notable exception of IBM SYSPLEX), which require scale-up within a frame to increase power. In that model, additional CPUs within the same physical or virtual system are required to increase power. MP factors limit the scale-up model: as more CPUs are added, synchronization effects among the cores reduce the power of each additional CPU. The scale-out model, involving ETP compute nodes and shared-nothing database nodes, confines synchronization among nodes to higher-level components, such as page- or record-level locks rather than words within CPU cache lines.
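To make the scale-up versus scale-out contrast concrete, here is a toy model in Java. It is only a sketch: the geometric decay of each added CPU's contribution, the `mpFactor` value, and the assumption that shared-nothing nodes add up linearly are all illustrative choices, not measured MP factors or Heirloom figures.

```java
/** Toy model only: contrasts diminishing scale-up returns with scale-out across nodes. */
final class ScalingModel {

    /**
     * Relative power of one system with the given number of CPUs, where each added CPU
     * contributes mpFactor times the contribution of the previous one (assumed geometric decay).
     */
    static double scaleUpPower(int cpus, double mpFactor) {
        double total = 0.0;
        double contribution = 1.0;
        for (int i = 0; i < cpus; i++) {
            total += contribution;
            contribution *= mpFactor;
        }
        return total;
    }

    /** Relative power of several independent nodes, assuming their power simply adds up. */
    static double scaleOutPower(int nodes, int cpusPerNode, double mpFactor) {
        return nodes * scaleUpPower(cpusPerNode, mpFactor);
    }

    public static void main(String[] args) {
        double mpFactor = 0.9; // hypothetical value, chosen only to make the effect visible
        System.out.println("16 CPUs in one frame: " + scaleUpPower(16, mpFactor));
        System.out.println("8 nodes x 2 CPUs:     " + scaleOutPower(8, 2, mpFactor));
    }
}
```

With the same total CPU count, the model gives the multi-node layout more usable power because the per-CPU penalty only accumulates inside each small node.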
Further, the pay-as-you-go cloud infrastructure model extends to the Heirloom ETP system. In sync with the op-ex business model, an organization may subscribe to CPU-core-hour power as needed. Only the amount of power actually processing workload is charged-for in this model. With the alternative scale-up model, organizations must “plan for peak” since a physical machine must be configured for a certain number of CPU cores when built and transaction systems are charged on this maximum power available to it.
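The difference between the two charging models can be sketched with a little arithmetic. The Java below is illustrative only: the hourly demand figures and the class and method names are hypothetical, and it counts core-hours rather than applying any real price list.

```java
import java.util.Arrays;

/** Illustrative only: compares pay-as-you-go core-hours with a plan-for-peak allocation. */
final class CoreHourCost {

    /** Core-hours billed when only the cores actually in use each hour are charged. */
    static int payAsYouGo(int[] coresNeededPerHour) {
        return Arrays.stream(coresNeededPerHour).sum();
    }

    /** Core-hours billed when the machine is sized for the busiest hour and charged at that size throughout. */
    static int planForPeak(int[] coresNeededPerHour) {
        int peak = Arrays.stream(coresNeededPerHour).max().orElse(0);
        return peak * coresNeededPerHour.length;
    }

    public static void main(String[] args) {
        // Hypothetical demand over 24 hours: quiet overnight, busy during office hours.
        int[] demand = {1, 1, 1, 1, 1, 1, 2, 4, 8, 8, 8, 8, 8, 8, 8, 8, 4, 2, 2, 1, 1, 1, 1, 1};
        System.out.println("pay-as-you-go core-hours: " + payAsYouGo(demand));
        System.out.println("plan-for-peak core-hours: " + planForPeak(demand));
    }
}
```

The gap between the two totals is the capacity a plan-for-peak configuration pays for but does not use.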
Each compute node in the Heirloom transaction environment is an Enterprise Legacy platform-as-a-service (ELPaaS, also referred to as Heirloom PaaS) built on top of the public/private IaaS. The online transaction processing (OLTP) environment of ELPaaS starts with user application programs (transactions) written to the IBM CICS transaction application program interface. The Elastic COBOL compiler translates user COBOL programs into Java code so that they run in common Java Virtual Machine environments. One of these is the Java Enterprise Edition (JEE) Application Server environment. The Elastic Transaction Platform coordinates transactions and implements CICS features and functions such as journals, transient data queues, and distributed program links (DPL) to other ETP regions. ETP runs under control of the JEE server because all user transactions are packaged in Enterprise Archives (.ear files) as Enterprise Java Beans (EJBs). When ETP coordinates DPL communication between the nodes, it does so through EJB-to-EJB communication protocols such as EJBD, IIOP and RMI, the same protocols used in other Java enterprise application environments such as IBM WebSphere, Oracle WebLogic, Red Hat JBoss or Apache Geronimo.
User transaction code interacts with the CICS file I/O (indexed-sequential file access) API and/or embedded SQL relational database access to store transactional state. For file I/O, ETP maps the application’s COBOL record structure to columns and tables in a SQL database. The program’s database requests are sent off to the cloud-based SQL database nodes. Figure 1 shows the multi-node ELPaaS and database architectures.
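The idea of mapping a COBOL record structure to SQL columns can be sketched as follows. This is a minimal illustration, not ETP's actual mapping: the record layout, the column names and the `RecordToColumns` class are all hypothetical, and the record is assumed to have already been converted to a character string.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: splits a fixed-width record into named column values. */
final class RecordToColumns {

    /** One field of the record: target column name, zero-based offset, length. */
    static final class Field {
        final String column;
        final int offset;
        final int length;

        Field(String column, int offset, int length) {
            this.column = column;
            this.offset = offset;
            this.length = length;
        }
    }

    // Hypothetical layout standing in for a COBOL record such as:
    //   05 CUST-ID    PIC 9(6).
    //   05 CUST-NAME  PIC X(10).
    //   05 BALANCE    PIC 9(5)V99.   (two implied decimal places)
    static final Field[] LAYOUT = {
        new Field("CUST_ID", 0, 6),
        new Field("CUST_NAME", 6, 10),
        new Field("BALANCE", 16, 7)
    };

    /** Cuts the record at the layout's offsets and trims trailing padding from each value. */
    static Map<String, String> toColumns(String record) {
        Map<String, String> row = new LinkedHashMap<>();
        for (Field f : LAYOUT) {
            row.put(f.column, record.substring(f.offset, f.offset + f.length).trim());
        }
        return row;
    }

    /** Interprets a string of digits as a number with an implied decimal point (the COBOL "V"). */
    static BigDecimal impliedDecimal(String digits, int scale) {
        return new BigDecimal(new BigInteger(digits), scale);
    }

    public static void main(String[] args) {
        Map<String, String> row = toColumns("000042JANE DOE  0012345");
        System.out.println(row);
        System.out.println(impliedDecimal(row.get("BALANCE"), 2));
    }
}
```

Each entry in the resulting map corresponds to one column of a row; a real mapping would also need to handle packed and binary fields, which this sketch leaves out.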
Fig. 1. Structure of transaction COBOL applications in an ETP environment.
In order to compute the cost of the cloud scale-out model relative to the original system, we must find the rough equivalent of an IBM mainframe of a certain known size. The basic process of Performance Benchmarking is to run equivalent workloads at similar speeds on both platforms, the IBM mainframe CICS and the mainframe-migrated ETP environment.
Initially linked to the clock speed of a processor, the mainframe MIPS (millions of instructions per second) ratings have been expanded to mean the power to execute a certain transaction workload. Machines with differing I/O subsystems, clustering interconnections and memory configurations can generate different MIPS ratings independent of the CPU speed. MIPS ratings of a particular platform may also be affected by software changes to the underlying system since improvements in database searching algorithms, for example, will improve the component score on that benchmark.
IBM provides other performance information in its Large Systems Performance Reference (LSPR). The LSPR changes over time based on how IBM believes their customers are using their systems. Beginning with the introduction of the z990 in 2003, IBM has changed the mix of the benchmarks to include an equal mix of (1) a traditional IMS transaction workload, (2) a workload that includes a traditional CICS/DB2 workload, (3) a WebSphere and DB2 workload, (4) commercial batch with long job steps, and (5) commercial batch with short job steps. Eventually, these configurations are related by third parties back to traditional MIPS ratings. IBM also uses them to generate a metric for Millions of Service Units (MSU), a rating that defines software license costs for various mainframe configurations.
Transaction Processing Benchmarks
Since approximately half of the LSPR can be attributed to transaction benchmarks, it is appropriate to look at transaction processing benchmarks. The transaction processing and database community utilize a series of benchmarks by the Transaction Processing Performance Council (TPC) to measure relative throughput of a system. Since 1993, the TPC-C online transaction processing benchmark has been used to demonstrate the effectiveness of hardware platforms, database systems and transaction systems. One problem with the benchmark has been its popularity – it is now featured in Wall Street Journal advertisements when new records are reached.
TPC-C has become a highly tuned benchmarking vehicle to demonstrate the effectiveness of a hardware or software vendor’s product. And that is the problem; the benchmark cannot really be used to show the relative strength of one platform against another because each benchmark run is designed to meet different goals. One of the current performance leaders uses the figures to promote its UNIX database and transaction engine by offloading the business logic to 80 Windows PCs. Another set of TPC-C performance results shows the database vendor stripping the benchmark of its business logic component, choosing to implement the whole of the transaction in procedures embedded in the database itself. The so-called clients merely “kick off” the transactions to execute within the database stored procedures. The TPC-C specification deals with these variations by forcing vendors to compute a total cost of ownership of the entire processing environment. So one person’s TPC-C is a distributed processing benchmark where another’s is a database-only benchmark; the cost is roughly the same.
Not enough attention has been paid to keeping the implementation of the TPC-C benchmark static while only varying the hardware and software components. The TPC-C specification was written to mimic the data processing needs of a company that must manage, sell or distribute a product or service (e.g., car rental, food distribution, parts supplier, etc.). Although the TPC-C does not attempt to show how to build the application, the guidelines discuss a general-purpose set of order entry and query transactions, occasional “stock” manipulation and reporting functions, with varying inputs (including artificially injected errors) that play havoc on various systems. The TPC-C benchmark was written in such a way that as the simulated company’s business expands, the number of warehouses and warehouse districts increases along with the workload generated by the expanding customer base. In order to achieve an increase in transaction rate you must add data entry personnel, according to the spec.
The TPC-C specification lays out very simple steps to achieve the end-result of the five transaction types. Each transaction reads a half dozen or more records from various database tables, analyzes them, adds in simulated user input, and updates one or more database records. From the end-user (data entry personnel) perspective, each transaction begins from a menu screen. From there, the benchmark simulates a never-ending session that chooses among the 5 transaction types at random, with a weighting that carries out a “new order” process about 45% of the time. It is these “new orders” that eventually dictate the transaction benchmark figure, the maximum qualified throughput (MQTh), specified in transactions per minute (tpmC).
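The weighted random choice among the five transaction types can be sketched in a few lines of Java. The weights below follow the commonly cited TPC-C minimum mix (Payment 43%, Order-Status, Delivery and Stock-Level 4% each, leaving roughly 45% for New-Order); treat them as an assumption and check the specification before relying on them. The class and method names are hypothetical.

```java
import java.util.Random;

/** Sketch: picks a transaction type according to a fixed percentage weighting. */
final class TransactionMix {

    static final String[] TYPES = {"NEW_ORDER", "PAYMENT", "ORDER_STATUS", "DELIVERY", "STOCK_LEVEL"};

    // Assumed weights in percent, in the same order as TYPES; they must sum to 100.
    static final int[] WEIGHTS = {45, 43, 4, 4, 4};

    /** Maps a roll in [0, 100) to a transaction type by walking the cumulative weights. */
    static String pick(int roll) {
        if (roll < 0 || roll >= 100) {
            throw new IllegalArgumentException("roll must be in [0, 100): " + roll);
        }
        int cumulative = 0;
        for (int i = 0; i < TYPES.length; i++) {
            cumulative += WEIGHTS[i];
            if (roll < cumulative) {
                return TYPES[i];
            }
        }
        throw new IllegalStateException("weights do not sum to 100");
    }

    public static void main(String[] args) {
        Random random = new Random();
        int[] counts = new int[TYPES.length];
        for (int i = 0; i < 100_000; i++) {
            String type = pick(random.nextInt(100));
            for (int t = 0; t < TYPES.length; t++) {
                if (TYPES[t].equals(type)) {
                    counts[t]++;
                }
            }
        }
        for (int t = 0; t < TYPES.length; t++) {
            System.out.println(TYPES[t] + ": " + counts[t]);
        }
    }
}
```

Keeping `pick` a pure function of the roll makes the weighting easy to check without depending on the random number generator.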
The simple tuning database vendors often perform at this point reorders the list of actions in the specification that define a transaction. It might be to reduce lock contention or the time database records are held with exclusive locks. After this, the hardware or software vendors tune the system to make a point of stressing their component over others that are more or less involved in the benchmark, as a means of meeting the benchmarking criteria.
Heirloom Computing wrote a COBOL implementation of the TPC-C benchmark as a measure of workload but left it essentially untuned from the original specification. The TPC-C spec indicated (although did not mandate) that the application code, transaction control and data control reside on a single system and that network-attached users communicated with the application through a screen interface over a networking protocol. Our transaction benchmark environment maintains this relationship among components because it duplicates application environments that have been used on the mainframe for years and are now subject to migration from the mainframe. The spec defines the user interface as a series of screens. In the COBOL TPC-C the screen interface is handled through the use of CICS BMS maps that lay out the 24-line, 80-column, 3270 terminal screen used for data entry and report generation. The Heirloom technology maps these screens to RESTful Web services — XML or XHTML transmitted over HTTP protocol.
To complete the benchmarking system, a driver system injects transactions at the rate and with the data defined in the TPC-C spec. The injection system issues HTTP requests and analyzes the returned “screen” to generate the request for the follow-up transaction. Each driver simulates hundreds of users issuing transactions to the system. Multiple drivers are started to scale the benchmark workload against the system under test.
The TPC-C benchmark was run under the Heirloom transaction architecture of Fig. 1. The cloud scale-out model allows the number of ELPaaS virtual machines (ETP nodes running TPC-C) and DB nodes to be increased as the work increases. For the benchmark test, the ETP nodes were increased from one to two and the DB nodes from two to four. The VMware in-memory database SQLFire was used as a scalable database for the benchmark. A total of 4 transaction injection nodes were required to pump work into the system. Fig. 2 shows the overall benchmarking system and the results from an hour run.
Fig. 2. The benchmarking system and results running in an EMC vBlock hardware environment.
The result of the benchmark run in Fig. 2 shows that a mainframe equivalent of 1100 MIPS was processed by two ELPaaS nodes running in Linux virtual machines of 2 CPU cores each. Apache Geronimo 3.0 served as the JEE server hosting ETP and the TPC-C application code structured as CICS regions. The 1100 figure is about double the single-node case. The architecture will scale accordingly as either CPU cores are added to the compute nodes or the number of nodes is increased.
Scalability and Availability
Scaling out performance across multiple nodes requires a load-balancing component. Such a component may be a commercial load balancer from the likes of VMware/EMC or may be embedded in network routers. When under control of the Heirloom Elastic Scheduler Platform (ESP), these nodes can be started and stopped on demand. Figure 3 shows these systems being powered up and down to handle the workload at the time.
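The request/reply loop of one simulated user can be sketched as below. Everything specific here is hypothetical: the `MENU` starting request, the `NEXT=` reply convention and the `DriverSketch` class are invented for illustration, and the transport is passed in as a function so the loop can be exercised without a server. In a real driver the transport would issue the HTTP request, for example with `java.net.http.HttpClient`.

```java
import java.util.function.Function;

/** Sketch: one simulated user that sends a request, reads the reply "screen", and derives the next request. */
final class DriverSketch {

    /** Runs one simulated user for a fixed number of requests and returns how many replies were received. */
    static int runUser(Function<String, String> transport, int requests) {
        String request = "MENU"; // hypothetical first request: the menu screen
        int replies = 0;
        for (int i = 0; i < requests; i++) {
            String screen = transport.apply(request);
            replies++;
            request = nextRequest(screen);
        }
        return replies;
    }

    /**
     * Hypothetical rule: a reply of the form "NEXT=<name>" names the follow-up transaction;
     * any other reply sends the user back to the menu.
     */
    static String nextRequest(String screen) {
        String marker = "NEXT=";
        return screen.startsWith(marker) ? screen.substring(marker.length()) : "MENU";
    }

    public static void main(String[] args) {
        // Stub transport standing in for the system under test.
        Function<String, String> stub = request -> request.equals("MENU") ? "NEXT=NEWORDER" : "DONE";
        System.out.println("replies received: " + runUser(stub, 10));
    }
}
```

Running many such users concurrently, each with its own pacing, is what a full driver adds on top of this loop.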
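As a worked example of the figures above, the Java below derives the per-node and per-core throughput from the reported 1100 MIPS on two 2-core nodes, then projects a larger configuration. The projection assumes throughput grows linearly with cores, which is a simplification; only the 1100 MIPS, two-node and two-core figures come from the benchmark, and the class name is hypothetical.

```java
/** Worked arithmetic for the benchmark figures; the linear projection is an assumption. */
final class MipsScaling {

    /** Average mainframe-equivalent MIPS delivered per CPU core. */
    static double mipsPerCore(double totalMips, int nodes, int coresPerNode) {
        return totalMips / (nodes * coresPerNode);
    }

    /** Projected MIPS for a configuration, assuming every core contributes the same amount. */
    static double projectedMips(double mipsPerCore, int nodes, int coresPerNode) {
        return mipsPerCore * nodes * coresPerNode;
    }

    public static void main(String[] args) {
        double perCore = mipsPerCore(1100, 2, 2);
        System.out.println("MIPS per core: " + perCore);
        System.out.println("MIPS per node: " + projectedMips(perCore, 1, 2));
        System.out.println("Projected for 4 nodes x 2 cores: " + projectedMips(perCore, 4, 2));
    }
}
```

The per-node figure matches the statement that the two-node result is about double the single-node case.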
Fig. 3. The private cloud deployment Infrastructure-as-a-Service (IaaS) environment
In addition to scaling out to arbitrary performance levels, the multi-node model also supports high availability. Since the user application code in most mainframe systems is written to the pseudo-conversational model there is no transaction state that must be maintained in the ETP or JEE EJBs between requests. Any node in the system may fail (hardware or software) and the load balancing component or injection system will re-route the request to a surviving node. The database nodes are similarly configured as highly available in that SQLFire ensures that database records and locks are replicated across multiple DB nodes.
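Because no transaction state is held on a node between requests, re-routing can be as simple as skipping nodes that are known to be down. The Java below is a minimal sketch of that idea using round-robin selection; the `FailoverRouter` class and the node names are hypothetical and do not describe any particular load balancer.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch: round-robin routing that skips nodes marked as failed. */
final class FailoverRouter {

    private final List<String> nodes;
    private final Set<String> down = ConcurrentHashMap.newKeySet();
    private final AtomicInteger next = new AtomicInteger();

    FailoverRouter(List<String> nodes) {
        this.nodes = List.copyOf(nodes);
    }

    void markDown(String node) {
        down.add(node);
    }

    void markUp(String node) {
        down.remove(node);
    }

    /** Returns the next surviving node in round-robin order, trying each node at most once. */
    String route() {
        for (int attempt = 0; attempt < nodes.size(); attempt++) {
            String candidate = nodes.get(Math.floorMod(next.getAndIncrement(), nodes.size()));
            if (!down.contains(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("no surviving nodes");
    }

    public static void main(String[] args) {
        FailoverRouter router = new FailoverRouter(List.of("etp-1", "etp-2"));
        System.out.println(router.route());
        router.markDown("etp-1");
        System.out.println(router.route());
    }
}
```

Any request can go to any surviving node only because the application is pseudo-conversational; a design that kept session state on a node would also need to replicate or recover that state.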