HOB RD VPN and SSL Accelerator Cards

Posted by Documentation Fri, 12 Sep 2008 14:41:00 GMT

HOB RD VPN is a remote access solution based on SSL. The SSL part is quite old; it was first developed for the 3270 data stream, to provide secure access to IBM mainframes. Later it was extended to support RDP, then HTTPS for the HOB Web Server Gate and, finally, the tunneling of PPP. The SSL part of HOB RD VPN also has the product name HOBLink Secure. We believe that when HOB started developing its SSL solution, the popular OpenSSL was not yet available.

So over time the question arose at HOB: Should we support SSL Accelerator Cards?

First I have to mention that we at HOB work with the latest servers from leading vendors. We always buy the models with the highest clock speed. The HOB development people should not waste their time working (meaning compiling and debugging) on slow machines.

We acquired some SSL Accelerator Cards as test equipment and got them working. These cards were connected to the PCI bus of an x86 system.

But the results of our tests were quite disappointing. For the calculation of the asymmetric RSA key, the SSL Accelerator Cards gave some advantage over the pure software solution. But for the symmetric encryption algorithms, mainly AES, currently the most widely used, the cards gave no advantage over the software solution. One or more of the cards were even slower than the pure software solution.

The asymmetric RSA key is calculated only once, at session start. The key found may also be re-used between two partners that have one or multiple SSL connections between them. So the speed of symmetric encryption is far more important than the calculation of the asymmetric RSA key.

We believe the reason for not getting more speed out of the SSL Accelerator Cards is the following: The CPU, when an SSL Accelerator Card is used, still has something to do: it has to send all the data down the bus, then the card will process, and afterwards the data is sent over the bus again. Whether there are cards which directly access the main memory is something I don't know.
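This reasoning can be put into a rough back-of-the-envelope model. The following Python sketch is only an illustration: the function names are made up for this example, and all throughput figures are hypothetical values, not measurements from the tests described above. It assumes the data crosses the bus twice (to the card and back) and that the transfers and the processing on the card happen one after the other.

```python
def software_time(n_bytes, cpu_bytes_per_s):
    """Seconds needed to encrypt n_bytes in pure software on the CPU."""
    return n_bytes / cpu_bytes_per_s


def card_time(n_bytes, bus_bytes_per_s, card_bytes_per_s, per_call_overhead_s=0.0):
    """Seconds needed to encrypt n_bytes on a card: bus out, process, bus back."""
    transfer = 2 * n_bytes / bus_bytes_per_s   # data crosses the bus twice
    processing = n_bytes / card_bytes_per_s    # work done on the card itself
    return per_call_overhead_s + transfer + processing


if __name__ == "__main__":
    record = 16 * 1024      # one 16 KiB block of data
    cpu = 100e6             # hypothetical: CPU encrypts 100 MB/s
    bus = 100e6             # hypothetical: bus moves 100 MB/s
    card = 400e6            # hypothetical: card encrypts 400 MB/s

    print("software:", software_time(record, cpu))
    print("card:    ", card_time(record, bus, card))
```

With these assumed numbers the card loses although its encryption engine is four times faster than the CPU, because the two bus transfers alone already take longer than the software encryption. Whether a real card behaves this way depends on its actual bus and on whether it can access main memory directly.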

The HOB SSL suite contains all major encryption algorithms, including AES with a 256-bit key length. HOB SSL also contains compression as an option. Netscape defined the original SSL protocol, and the handshake already included parameters for compression. So HOB included compression, but we do not know of any other vendor who did so as well. Tests have shown that compression uses about as many CPU resources as symmetric encryption. But the SSL Accelerator Cards do not include compression. We have seen hardware that does compression. But the compression ratio of the hardware does not compare well with that of software solutions. The reason is that compression algorithms allow some fine-tuning in software; doing the same in hardware would be too complicated. So we concluded that compression should be done in software as well.

At HOB, we have successfully tested the WebSecureProxy, the SSL gateway, with 10,000 simultaneous sessions, on both Windows and Linux, using x86 CPUs in mid-size servers. No SSL Accelerator Card was necessary to reach 10,000 simultaneous SSL sessions. Each session ran simulated RDP traffic, in which the user neither sends nor receives data continuously.

We believe, on big machines, even more than 10,000 simultaneous sessions are possible - without an SSL Accelerator Card. And HOB RD VPN supports clustering as well.

The HOB WebSecureProxy was designed with heavy loads in mind, and it can keep any number of CPUs busy without causing delays for the users.

The HOB SSL encryption routines have been examined by the BSI, the German Bundesamt für Sicherheit in der Informationstechnik. The HOB SSL encryption routines got certified under Common Criteria. In the course of this certification, HOB made a large number of compatibility tests with other SSL solutions.

If HOB customers used SSL Accelerator Cards, they would have to pay extra for the cards. Also, as hardware can break, our customers would need spare parts.

With the HOB solution, especially with the Unix-based OpenSource operating system HOB SCS, when there is a hardware problem you go to the nearest PC dealer, get a server out of the box, install the software and the problem is fixed.

I believe we get a lot of processing power out of modern CPUs today, and these CPUs have many cores. So spending money on fast CPUs is better than spending money on SSL Accelerator Cards.

12.09.08 Klaus Brandstätter


Scaling of Servers

Posted by Documentation Fri, 12 Sep 2008 13:47:00 GMT

How many servers will we need for a certain application?

This document provides some insight into the servers necessary for WTS (Windows Terminal Services) or VDI (Virtual Desktop Infrastructure).

In the old days, companies had just a single mainframe and certain applications running on it. As CPU load generally is the primary bottleneck, the people responsible for the mainframe just looked at the CPU utilization and, when it was over 80%, they knew they had to buy a bigger one. When paging was too high they had to buy more memory.

Today we can run applications in many ways, and these ways all have different needs for computing power and memory.

First we look at desktops. Windows Vista has been available since early 2007, and most users have found that running Windows Vista with less than one gigabyte of memory is quite slow.

When we have desktops, these should share their data. This is typically done over a file server. I have heard people say a modern server can handle 5,000 simultaneous users when this server is used as a file server.

How about Terminal Services? At the time of writing, most terminal servers are used by around 50 users each. Mostly 32-bit operating systems are used; the problem with the 64-bit versions of Windows is that not all drivers are available in 64-bit versions. As companies have more users than that, load balancing is used to distribute the users across multiple servers.

But Windows Server 2003 and Windows Server 2008 can handle many more concurrent users. In the paper on

http://www.microsoft.com/windowsserver2003/techinfo/overview/tsscaling.mspx

Microsoft says they have tested several hundred users on hardware from 2003. As today we have more advanced hardware with fast CPUs, many cores and lots of cheap memory, I believe running terminal servers with hundreds of users is no problem. Of course, 64-bit hardware and software are needed in this case.

But some key points should be kept in mind: Windows, like other modern operating systems, does paging in its default configuration. Paging means that memory which is not frequently used is written to the page files on disk, so less physical memory is needed. What happens if a machine has enough memory so that no paging is needed? As Windows always tries to keep some memory available for newly started applications, it still writes pages to disk and frees main memory, even though this is not needed. The problem today is that the gap between CPU speed, memory speed and disk speed keeps widening. And as we have more memory, the page file on disk gets bigger and it takes even longer to read in the requested memory pages when they are needed.

We have a big WTS in use for our developers, running MS Visual Studio and Eclipse. Our WTS is sometimes up for many months and users do not log off. In the evening they disconnect their RDP session. When they come in the next morning they start their RDP client and can continue to work where they left off the day before. Our users find that straightforward. When we had paging enabled and applications had not been used for maybe several days, the users clicked on the icon in the task bar and had to wait maybe 10 minutes until the application was paged in and they could work again. Sometimes the application did not show up any more, even after hours. All these problems disappeared when we turned off paging in the Windows operating system. Without paging, our system always responds immediately to every action users make. It's a great system!

I can give the following advice: Buy so much memory for your servers that paging can be disabled. Memory is cheap and will get even cheaper. Disable paging by configuring Windows accordingly.

Once paging is turned off, you no longer need fast disks. It is sufficient to buy big, cheap SATA disk drives instead of the faster but smaller and more expensive SCSI disks.

Let us talk about VDI or desktop virtualization. Using VDI means you have servers running virtualized single-user operating systems such as Windows XP or Windows Vista.

When VDI is used, multiple copies of the (guest) operating system are simultaneously in memory, so much more memory is needed. Remember that you should not have less than one gigabyte for every Windows Vista. And remember you should have paging turned off. When running VDI, I recommend you disable paging on the host level, but paging on the guest level will still be needed (otherwise you would need just too much memory). If you have a Windows Vista system and give it only one gigabyte of memory without paging, you will not be able to start many applications.

So when we have a server with 16 gigabytes of memory, we can only have around 16 simultaneous Windows Vista guests running.
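The arithmetic behind this estimate is simple enough to write down. The following Python sketch is just an illustration; the function name and the `host_reserved_gb` parameter (memory set aside for the host system itself) are assumptions made for this example and are not part of any product.

```python
def max_guests(host_memory_gb, guest_memory_gb=1.0, host_reserved_gb=0.0):
    """Number of guests that fit if all guest memory must stay in physical RAM.

    host_reserved_gb is a hypothetical allowance for the host system itself.
    """
    usable = host_memory_gb - host_reserved_gb
    if usable <= 0:
        return 0
    return int(usable // guest_memory_gb)


if __name__ == "__main__":
    print(max_guests(16))            # one gigabyte per Windows Vista guest
    print(max_guests(16, 2.0))       # two gigabytes per guest
    print(max_guests(16, 1.0, 2.0))  # two gigabytes kept back for the host
```

As soon as you give each guest more than the one-gigabyte minimum, or keep some memory back for the host, the number of guests per server drops quickly.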

So, compared to WTS, VDI needs many more hardware resources. However, there are still occasions where VDI is more suitable than WTS. And hardware keeps getting more powerful and cheaper, so VDI, in some cases, becomes more attractive.

13.08.08 Klaus Brandstätter


3270 - A Brief History

Posted by Documentation Fri, 12 Sep 2008 13:38:00 GMT

Recently I was surfing the Web and found something about my company, HOB, on Wikipedia. In the German Wikipedia I found an article about 3270 which I believe does not cover the technical background. And I also think there is something wrong: The article says Attachmate produced 3270 terminals in the past. I remember Attachmate produced 3270 coax cards, but never real terminals.

My name is Klaus Brandstaetter, I was born in 1954, and I have been CEO of HOB Germany since 1981. HOB was deeply involved in the 3270 market. As I know a lot about 3270, I decided to write something about it. As 3270 was a billion dollar market in the 70's and 80's, and 3270 is still in use today, there is much to say. I will concentrate on the important facts only, but this will still be a big article.

IBM with the /370 mainframe system was the dominant IT company in the 70's and 80's. At IBM, each product had a four-digit number, so the terminal they developed for the /370 mainframe got the number 3277. The first display had a very small screen displaying only 12 rows of 40 characters. The terminal had a white cabinet made of sheet metal. This was called Model one; there was also the Model two, which had the same cabinet but a bigger 15-inch screen displaying 24 rows of 80 characters each. The characters were displayed in green on a black background. Some say WYSIWYG, in the case of 3270, stands for "What You See is what IBM Gave You." Each 3277 terminal was connected over a coaxial cable to a control unit. The transmission speed of this coaxial cable was at first around one megabit per second. Later, with the next generation, 2.3 megabits per second were achieved. A terminal did have some logic, but no CPU; each terminal, of course, had a screen buffer (memory) where the characters to be displayed were stored.

If each terminal had been connected to the mainframe directly, each keystroke would have generated an interrupt on the mainframe. At that time, computing power was limited, so it was not possible to process a lot of interrupts in a given time. So IBM chose to connect each terminal to a so-called control unit. The IBM part number of the first control unit was 3272; the 3274 and 3174 followed later. The first control units had a CPU, but as memory was very expensive, the CPU used the screen buffers of the terminals as memory. Commands that accessed the memory on the terminals were exchanged over the coaxial cable between control unit and terminal. Terminals that worked that way were also called CUT, Control Unit Terminal. Each control unit could handle up to 32 terminals; later a control unit could handle up to 128 terminals.

The control unit was connected to the mainframe (channel) with a so-called Bus/Tag cable. This cable had eight parallel wires for the data and another eight parallel wires for the /370 channel command. A control unit could also be used remotely, typically over full-duplex lines with a bandwidth of 9,600 bits per second. Can you imagine 32 users today working over a single telecommunications line with just 9,600 bits per second? But in the 80's this was common and the users did not complain. It is also worth mentioning that 3270 terminals, especially the 3277, could also be connected to an IBM /3 computer, which was less powerful than the /370 mainframe. The IBM /3 was the ancestor of what later would become the /34, /38, /36, for a long time AS/400 and what is now called the i-Series. With the IBM /3 computers, 3277 terminals were used with coaxial cable, but the successors got 5250 terminals with a more sophisticated protocol, connected over twinax cable, which is also called twisted pair.
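To put the 9,600 bits per second line into perspective, here is a small Python calculation. It is a simplified sketch: the function names are made up for this example, one byte per screen character is assumed, and all protocol overhead (framing, orders, attributes) is ignored.

```python
def transmit_seconds(n_bytes, line_bits_per_s=9600):
    """Seconds needed to send n_bytes over the line, ignoring protocol overhead."""
    return n_bytes * 8 / line_bits_per_s


def share_bits_per_s(line_bits_per_s=9600, users=32):
    """Average share of the line per user if all users transmit at the same time."""
    return line_bits_per_s / users


if __name__ == "__main__":
    full_screen = 24 * 80   # characters on a 24 x 80 screen, one byte each
    print("bytes per full screen:  ", full_screen)
    print("seconds per full screen:", transmit_seconds(full_screen))
    print("bits/s per user (32):   ", share_bits_per_s())
```

Under these assumptions a single full screen already occupies the line for more than a second. This only worked because users spent most of their time reading and editing locally, and because applications often sent just part of a screen.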

For the 3270 terminal to work, the software in the mainframe had to use the 3270 protocol. Typically the EBCDIC character set was used in the 3270 protocol. A terminal could display up to 192 different characters; the remaining byte encodings were orders and attributes. The application in the mainframe sent either a complete screen or a part of it, mixed with orders. Such orders included setting the cursor, moving to other locations on the screen and orders to fill in attributes.

Such attributes separated the screen into different fields; some of them could be edited by the user, others were protected. When the mainframe application had sent data to the screen, the user could edit the page displayed and navigate around on the screen. All of this was independent of the host, the mainframe. When all the page editing was done, the user had to press a special key, either Enter, PFx (Program Function) or PAx (Program Attention), and only then was all edited text sent to the mainframe, to the receiving application.

In this way, computers with relatively little computing power compared to today could handle the traffic of many terminals, in some installations thousands or even tens of thousands.

This type of processing is called transactional processing and is similar to what we later got with HTML / Web 1.0.
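The principle of local editing can be illustrated with a toy model in Python. This is a sketch of the idea only and not the real 3270 data stream: the class names, the field names and the dictionary that is "sent" are all made up for this example, and only the Enter key is modeled.

```python
from dataclasses import dataclass


@dataclass
class Field:
    name: str                # hypothetical label, used only in this sketch
    text: str
    protected: bool = False  # protected fields cannot be edited by the user
    modified: bool = False   # set as soon as the user changes the field


class Screen:
    def __init__(self, fields):
        self.fields = {f.name: f for f in fields}
        self.sent_to_host = []   # every transmission to the host is recorded here

    def type_into(self, name, text):
        """Local editing: changes the terminal's buffer, sends nothing to the host."""
        field = self.fields[name]
        if field.protected:
            raise PermissionError(f"field {name!r} is protected")
        field.text = text
        field.modified = True

    def press_enter(self):
        """Only now does the edited text go to the host, in one transmission."""
        data = {f.name: f.text for f in self.fields.values() if f.modified}
        self.sent_to_host.append(data)
        return data


if __name__ == "__main__":
    screen = Screen([
        Field("label", "Customer no.:", protected=True),
        Field("custno", ""),
    ])
    screen.type_into("custno", "4711")
    print(screen.sent_to_host)    # still empty: the host has seen nothing yet
    print(screen.press_enter())   # one transmission with the edited field
```

However many keystrokes the user needs to fill in the screen, the host is involved only once, when Enter is pressed.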

When we look back at the history of 3270, the first generation consisted of the terminals 3277 and 3275, with the 3272 as control unit. The next generation consisted of the terminals 3278 and 3276, with the 3274 as control unit. With the 3278 there was the model 2, displaying 24 x 80 characters. New models were added: model 3 with 32 x 80 characters, model 4 with 43 x 80 characters and model 5 with 27 x 132 characters. Model 5, displaying 132 characters per line, was useful for viewing print output, since printers at that time usually printed lines 132 characters wide. All models had an additional line to display status information. All terminals could switch to the model 2 mode, displaying only 24 x 80 characters. Then followed the color terminal 3279, which could display characters in up to seven different colors. A special and very expensive terminal, the 3279G, could also handle graphics. Graphics were built by loading special characters with pixels in the desired colors and then addressing these characters in the data stream sent to the terminal. It should also be mentioned that, as base functionality, most of the IBM terminals had an additional built-in character set called APL characters. APL characters could also be entered from the keyboard. APL stands for A Programming Language, and the APL character set includes Greek letters and special mathematical symbols.

Besides the graphics addressable through loaded characters, IBM later added vector graphics.

In the 3270 family of terminals, IBM had many different models, for example 3277, 3278, 3178, 3180, 3179, and so on. The different generations of control units were called 3272, 3274 and finally 3174. There were also terminals with a small built-in control unit, like the 3275 and 3276. Many different printers were sold, some of them with the number 3286 or 3287. The printers were connected to the control unit over coaxial cable, just like the terminals. For the software, IBM tried to have printers that were programmed similarly to a terminal, sending more than one line to the printer in a chunk of the data stream.

IBM sold millions of 3270 terminals, mainly the 3278 and later models. But there was also competition from other manufacturers, such as Memorex, Telex, HOB, Lee Data, Ericsson, Nokia, MDS, ADI and Fujitsu. Memorex and Telex later merged into one company. Altogether there were about 15 different vendors producing 3270 terminals. Most just copied IBM's models; some had unique add-ons. Only HOB succeeded in manufacturing a 3270-compatible terminal with graphics support. Some of the vendors also built their own control units and used a different cabling scheme. The company McData was very successful building 3270-type control units.

The market for 3270 terminals was lucrative until the middle of the 90's. By this time standard PCs had already taken over the business with hardware and software emulating 3270 terminals.

Let us look at this market now: IBM sold the first IBM PC in August 1981 after just 9 months of development time. IBM then, for a certain time, dominated the PC market. Soon after IBM brought out the first PC, the company DCA brought out the so-called IRMA coax card, which fitted into the standard bus of the IBM PC. It came with software running under MS-DOS. Hardware and software together connected to an IBM (or compatible) control unit, and the user could do what he normally would have done with a hardware terminal. This solution also included file transfer in both directions between the IBM mainframe and the PC.

After that, IBM also developed such a coax card, and other manufacturers did the same. The emulation software from IBM was called Personal Communications or in short, PersCom. Let us name some of the other manufacturers and their products:

Attachmate with Extra!

Wall Data with Rumba

WRQ with Reflection

HOB with HOBLink 3270, later HOBLink Terminal Edition

At the end of the 90's, 3270 emulation was a big market and about one hundred companies had developed such software.

There are also 3270 emulations written in Java, such as HoD (Host-on-Demand) from IBM and HOBLink J-Term (Java Terminal).

Software on the mainframe that the 3270 client connects to includes:

IMS (Information Management System)

CICS (Customer Information Control System)

TSO (Time Sharing Option)

and many others.

Software that generates 3270 data stream with graphics includes IBM GDDM (Graphical Data Display Manager) and software from SAS.

In the beginning, the 3270 data stream was sent over simple connections such as channel attachment or a family of protocols called BSC (binary synchronous communication). In the mid-1970s, IBM introduced VTAM and SNA, Systems Network Architecture. SNA dominated the network protocols in use for a long time. Nowadays SNA is only used inside the IBM mainframes; all real network connections are made over TCP/IP. For 3270 emulations the protocol used (a type of telnet) is called TN3270 or TN3270E, defined in several RFCs.

Prominent people of the IT industry had made bets that the last mainframe would be switched off December 31st, 1999. As we know, this did not happen, and mainframes are still in use by many big organizations. And the 3270 protocol is also still in use.

Some time after the year 2000, an IBM marketing campaign claimed that every day there were still more 3270 transactions than transactions over the World Wide Web.

12.08.08 Klaus Brandstätter
